ABSTRACT


Announcement and distribution of documents through an automated selective 
dissemination of information (SDI) system is a response to the rapid growth of the 
literature as well as the growth of scientific and engineering staffs. NASA's 
developmental SDI Program, operating under an IBM 7090/94 computer system, 
was initially developed under contract by IBM Advanced Systems Development Division
and then transferred to operation by NASA's Scientific and Technical Information 
Facility. Over 2000 documents, the full contents of Scientific and Technical Aerospace 
Reports and International Aerospace Abstracts, are compared twice monthly with the 
expressed interests of over 700 participants, located at 10 NASA and 11 Air Force 
centers. Exceptional flexibility is possible in expressing user interests; match options 
include "must" single-word descriptors, two- to seven-word phrases, "not" terms and 
phrases, and percentage (or "may" word) matching, plus a number of special descrip- 
tors, such as contract numbers. In preparing the input document and user interest 
profiles for matching, a dictionary program provides machine-generated codes and can 
substitute cross-referenced descriptors. The computer configuration requires an 
IBM 7090/94 with 32K storage, two IBM 7607 data channels, and eight IBM 729 tape 
drives. System output consists of tab-card-sized abstract cards duplicating the high 
graphic arts quality of photocomposed journal abstracts. A response card provides 
for expression of interest by manual punchout and may be used to request the full text 
of selected documents. Development of the computer program and its exploratory 
operation are described in report NASA CR-62020. Program documentation is given 
in NASA CR-62021. In the present report, operating experience under NASA direction 
is presented, program modifications are described, and availability of the program is 
discussed. In February 1966, the 7090/94 SDI program was replaced by an IBM 1410 
program with planned emulation on an IBM Systems/360 Model 40. 
NASA SELECTIVE DISSEMINATION OF INFORMATION PROGRAM

(IBM 7090/94 SYSTEM)

By Gifford A. Young 


SECTION 1. SUMMARY 


Selective dissemination of information (SDI) makes use of the capabilities of com- 
puters to manage on one hand the accelerating growth of information and on the other the 
corresponding need of scientific and engineering staffs for selected portions of this in- 
formation. For SDI, computers are programmed to compare the subject indexes of doc- 
uments against the expressed interests of participants in an SDI system in order to 
select appropriate documents for announcement or direct distribution. The NASA SDI 
program (IBM 7090/94 system) described in this report is capable of matching the inter- 
ests of large numbers of users against the contents of large volumes of announcements. 

It permits user interests to be expressed with flexibility and precision, and provides 
notifications of selected documents in the form of exceptionally legible and convenient 
announcements. 

Primary hardware required includes an IBM 7090/94 data processing system 
with 32K storage, two 7607 data channels, and eight 729 tape drives. 

The program was developed under contract to NASA by IBM Advanced Systems 
Development Division during the period of June 1963 through December 1964. 

A description of the system in effect as of December 1964 is given in the following 
reports: 


International Business Machines Corp., Yorktown Heights, New York
Implementation, Test and Evaluation of a Selective Dissemination
System for NASA Scientific and Technical Information. Final Report.
Jun 1966, 83 pp. (Price $2.50)
(NASA CR-62020)

International Business Machines Corp., Yorktown Heights, New York
Program Documentation for a Selective Dissemination of Information
System for NASA Scientific and Technical Information.
Jun 1966, 223 pp. (Price $3.75)
(NASA CR-62021)

Flow charts and operating instructions are given in the second of these reports. 

The program was operated by the NASA Scientific and Technical Information
Facility, College Park, Md., under the technical direction of the NASA Scientific and
Technical Information Division during a transition period from September to December
1964, and was fully operable from January 1965 to February 1966. During this period,
significant changes in the operations were made on the basis of experience and user
requests. These changes are discussed in the appropriate sections of this report.

The present report is addressed primarily to organizations or individuals con- 
templating operation of the system independently. This report: 

• Presents essential information not given in the IBM final report and doc- 
umentation; for example, off-line programs for preparing operating reports. 

• Relates experience gained in actual operation of the system. 

• Documents changes that were made in the program between the completion 
of the IBM final report and the transfer of SDI operations to another com- 
puter system. 

• Advises on preparation of effective user interest profiles. 

• Relates library experience in handling SDI-generated requests for documents. 

• Suggests potential changes in operation and output media that an independent 
operator might wish to put into effect. 

The 7090/94 NASA SDI profile transaction and match programs described in 
NASA CR-62020 and NASA CR-62021 can be obtained from NASA by special arrangement.
Associated off-line 1401/1410 programs for printing user accessions, document profiles, 
etc., also can be supplied. 

In February 1966, the NASA SDI program was reoriented to take advantage of the 
availability for SDI operations of a different computer, an IBM 1410, and to prepare for 
the installation in the near future of an IBM Systems/360 with 1410 emulator hardware. 

A bibliographic retrieval-SDI program written for an IBM 1410 with 40K memory and process
overlap and priority features was demonstrated to work effectively; therefore the 7090/94 
program was inactivated. 

Further information about the NASA 7090/94 and 1410 SDI programs and other NASA 
technical information services may be obtained by writing to: 

Scientific and Technical Information Division 
Code USD 

National Aeronautics and Space Administration 
Washington, D. C. 20546 
SECTION 2. INTRODUCTION


One of today’s great problems in science and technology is the tremendous out- 
pouring of literature concerning new discoveries and developments. The flood of reports 
is compounded by the gradual elimination of sharp distinctions between disciplines; today's 
scientist or engineer must be alert for related developments in fields other than his own. 

The traditional way of keeping informed, by simply reading the literature, is no longer 
practicable. The individual who depended on his own examination of the vast paper 
avalanche would have no time for anything else. 

For a number of years, a partial solution has been the publication of abstract 
journals which enable the individual to scan summaries of the current literature and to 
select for detailed reading only those documents of clear value to his work. The abstract 
journal is still a basic resource, but the flow of information is rapidly reaching a point 
where scanning abstract journals for current awareness is a time consuming and often 
neglected chore. 

NASA's Selective Dissemination of Information (SDI) program is an approach to 
solving the literature problem by automatically notifying NASA scientists and engineers 
individually of new reports of value in their work (Fig. 1). Its success has been made
possible by evolving computer capabilities to select and deliver relevant information 
effectively and economically. 

Although the mechanics of NASA SDI (IBM 7090/94 system) involve a large computer, 
the program provides a personalized service. A participant in the NASA SDI program 
receives daily at his desk a few envelopes, each containing two tab-sized cards, as 
illustrated in figure 2. One of these cards carries a citation and abstract of a report or 
journal article selected by a computer by comparing an "interest profile" written by the 
participant with the subject indexes of all the literature entering the system in the past 
two weeks. This abstract card, which is on white paper for good legibility, can be 
filed for future reference either as a tab card or, since it has a perforated stub that is 
easy to tear off, as a standard 3x5 inch file card. 

The other card, on blue stock, facilitates a response to the system. It has 
prescored blanks that may be punched out with a pencil point to indicate (1) that the 
document is of interest and that the participant wishes to see it; or (2) that it is of 
interest, but that he does not want a copy at the present time; or (3) that he has seen it 
before; or (4) that it is not of interest. (The abstract and response cards are illustrated 
individually and described in detail in Section 7.)

A "comments" space permits the participant to advise the system operator to 
change his address, rewrite part of his interest profile, or suggest any changes that he 
thinks would improve the system. The user is an integral part of the NASA SDI system; 
significant improvements have come about from such user feedback and comments. 
The NASA SDI system is extensive both in input, which includes essentially all
the world's unclassified aerospace literature, and in the geographical distribution of its
present 700 participants, who are located at 21 research centers throughout the United 
States. How NASA came to develop such a system can best be introduced by describing 
briefly NASA's overall scientific and technical information activities, with which SDI is 
closely integrated. 

The Space Act of 1958, Public Law 85-568, which created the National Aero- 
nautics and Space Administration, laid out NASA's technical information job in very 
broad terms: To publish and disseminate widely the results of NASA research
activities and findings. This is done through large-scale publication and distribution of
NASA in-house (technical notes, technical memorandums, and translations) and 
contractor technical reports. "Repackaging" of scientific and technical information is 
also a major activity. This includes the issuance of project summaries, data compila- 
tions, handbooks, sourcebooks, monographs, technical reviews, state-of-the-art 
surveys, special bibliographies, and literature surveys. Spin-off of aerospace develop- 
ments of value to non-aerospace industry is encouraged through a technology utilization 
program which, as part of its activities, issues technology surveys, reports, notes, 
and NASA Tech Briefs. NASA also has the responsibility of providing worldwide 
research results of aerospace interest and significance to its scientists and engineers 
and contractors. To accomplish this, the Agency conducts an aggressive program to 
acquire the world's aerospace literature, to bring it under bibliographic control, and to 
announce and disseminate it in the shortest possible time to those who need the informa- 
tion. 


To carry out this mission, NASA has established the NASA Scientific and 
Technical Information Facility, operated by Documentation Incorporated under the 
technical direction of NASA's Scientific and Technical Information Division. The
Facility acquires the aerospace report literature, processes it into various eye-legible,
microdocumented, or machine-readable forms, and announces it in a semimonthly
abstract journal, Scientific and Technical Aerospace Reports (STAR).

Similarly, the American Institute of Aeronautics and Astronautics, New York 
City, by a cooperative arrangement with NASA, acquires the world's formally published 
aerospace literature — that appearing in journals, books, and conferences. AIAA pro- 
cesses this information in the same manner that NASA's Facility processes the report 
literature, and announces it semimonthly in an abstract journal, International Aerospace 
Abstracts (IAA). Citations and indexes of all announcements in both STAR and IAA
are also distributed on machine-readable tapes to NASA research laboratories and 
major NASA research and development contractors. 

Each semimonthly issue of STAR and IAA contains more than 1000 abstracts. 
During calendar year 1965, the two journals together announced in excess of 50,000
items. To meet the information needs of the busy scientist and engineer, it would 
clearly be desirable to bring to his attention only the relatively few announcements that 
are related specifically to his interests. Fortunately, all the necessary ingredients for
such selection (except the computer program) become available as part of the normal
production of STAR and IAA and related bibliographic and information services: abstracts,
indexes, tapes, and full texts of announced documents. 

In 1963, a proposal was made to NASA's Scientific and Technical Information 
Division by IBM's Advanced Systems Development Division, Yorktown Heights, New 
York, to conduct a developmental study of an advanced, large-scale SDI system that 
would take advantage of the speed and memory capacity of an IBM 7090/94 computer. 

As finally accepted, this study involved the selection of announcements from STAR only, 
the participation of 500 NASA scientists and engineers located at NASA Headquarters
and eight (now ten) research centers, a ten-month test of the system (November 1963
to August 1964), and a four-month evaluation (September to December 1964).

During this time, changes were continually being made in the selection techniques, 
either as experiments or as the computer programmers added new features. As might be 
expected, this caused the number of announcements and their relevance to vary sharply. 
Yet the participants came more and more to accept SDI as a satisfactory information 
service, some of them even abandoning the use of the abstract journals and depending 
on SDI exclusively. 

In June 1964, the membership increased when the U. S. Air Force asked to add 
200 of its personnel to the NASA program for several months to gain experience with an 
operating system. Interest has been so high that this participation has continued to the 
present. These 200 Air Force engineers, scientists, and administrators are located 
at 11 U. S. Air Force bases and research centers. 

As the developmental contract with IBM's Advanced Systems Development 
Division was to terminate December 31, 1964, transfer of SDI operations to the NASA 
Scientific and Technical Information Facility took place during September to December 
1964. During this period and during its subsequent independent operation of the 7090/94 
program, the Facility continued to provide service to about 700 users. A number of 
changes were made in the program at NASA direction. These are discussed in the 
following sections of this report. 

A change of particular note occurred when searching of the full contents of IAA 
was initiated in November 1964. This doubled the literature input to the system and the
number of announcements selected and forwarded to participants, and it created additional
pressure on library staffs, especially since the new announcements were journal 
articles, books, and conference papers — documents less easily accessioned and 
distributed than reports. However, it has benefitted the individual user in broadening 
the scope of items selected for him. 


Program reorientation. All aspects of the program described in this report
have been under continuous evaluation, with full recognition that effective operation of
the NASA SDI program is not restricted to a particular computer configuration, nor to a
particular form of announcement. Operation and further development of the program, 
which is based on an IBM 7090/94 program written during 1963 and early 1964, has 
taken place during a period of rapid changes in computer technology. New series of 
computers are being installed widely; of particular relevance, NASA's Facility is 
preparing to install an IBM Systems/360 Model 40 in July 1966. 

On the basis of the directions being taken by computer technology and consideration 
of the practicality of in-house operation of the program, a decision was made in late 
1965 to replace the IBM 7090/94 SDI program with one intended to operate on the IBM 
Systems/360 Model 40. In further consideration of the desirability of immediate trans- 
fer of operations to gain more convenient operator control of the program — the IBM 
7090/94 program required the cooperation of a NASA research center for the availability 
of time on an IBM 7094 computer that is usually fully occupied with scientific and 
technical computations — it was decided to convert as soon as possible to an SDI program 
written for an IBM 1410 computer. This computer was more readily available than 
the IBM 7094. Emulator hardware for the IBM 1410 will be installed on the planned IBM 
Systems/360 computer. A bibliographic search program written for an IBM 1410 with 
40K memory and process overlap and priority features was modified for SDI use. The 
revised program was demonstrated to work effectively; therefore, the 7090/94 program 
was inactivated in February 1966. 

The 7090/94 program will be made available on request to organizations 
interested in studying this particular SDI system. The programs, documentation, and 
associated 1401 and 1410 programs described in this report can be supplied by special 
arrangement with NASA. Program maintenance would be the responsibility of any 
organization receiving it and no guarantee concerning its operation can be made. 

In addition to the conversion of computer operations to a different computer, 
the form of the SDI announcement has also been changed. Users of the NASA SDI 
service now receive a computer printed listing of citations rather than the abstract 
cards described in this report. The current NASA SDI program thus differs from that 
described in this document and in the associated reports NASA CR-62020 and NASA
CR-62021. These three reports are published as a record of a unique SDI system and
a stage in the development of selective dissemination of information systems. 
SECTION 3. DOCUMENT PROFILES


The NASA SDI system matches representations of participants' interests (user 
profiles) against representations of the subject content of reports and published literature 
(document profiles). The format in which the document profiles enter the system is a 
critical factor in programming and operating the system. For projected independent 
operation of NASA SDI systems, knowledge of the identifiers, document codes, and 
descriptive cataloging details that appear on the input tape is essential for understanding 
the selection options available to an SDI operator. (The document representation used in 
the match program comprises only a portion of the full input information concerning each 
document.) Detailed awareness of the document input contributes to (1) understanding the 
operation of the NASA SDI program and (2) possibly modifying the program to suit
specific interests. 

Input file format. Document input is a computer tape having a linear file format 
and containing full citation and indexing (but not abstracts) of all documents that have 
been selected for appearance in the forthcoming issues of STAR and IAA.

No information concerning document profile input is given in the IBM program 
documentation, which states only: "This file is provided via other NASA operations 
from which its format can be obtained" (NASA CR-62021, p. 12). The document tape
format is presented in full in the following publication: 

Guide to the Processing, Storage, and Retrieval of 
Bibliographic Information at the NASA Scientific and 
Technical Information Facility 
June 1966 149 pp. (Price $3.25) 

(NASA CR-62033) 

The document tape has been formatted primarily for use with the variable-length- 
field IBM 1401 and 1410 computers; therefore, the input tapes as distributed by the NASA 
Facility require reformatting for use with an IBM 7090/94 computer. The reformatting
program is available as part of the NASA SDI package.

Document profile processing. The document input tape is processed by a
subroutine GTDOC, which is flow-charted in NASA CR-62021, pp. 52-7. During
operation of the program, the processing of the title field and the personal author field
described there was deleted.

Initially, the title field (actually the first line, equal to 50 characters or less, 
of the document title) and the personal author field on the tape were extracted by GTDOC, 
coded, and added to the dictionary. The truncated title (in effect a multiword phrase 
which frequently contained articles, prepositions, and odd combinations of numbers, 
acronyms, etc.) and the personal authors (at that time treated as a two- or three-word 
phrase consisting of surname and initials) were then able to match against user profiles
and generate announcements. 
Matching against a truncated title was intended to permit a possibly valid hit that
could not have occurred by a match against index descriptors alone. This concept was
related to the philosophy of entering user profiles in the user's unedited words. In
order to eliminate some commonly used words that have no information content yet add
to the length of phrases and may permit false matches, the following 44 articles and
prepositions were automatically eliminated from both document and user profiles (NASA
CR-62021, p. 40):


    a          as         like       round
    about      at         near       since
    above      back       of         the
    after      below      off        to
    ahead      but        on         under
    along      by         onto       until
    amid       down       or         unto
    among      for        out        up
    an         from       over       upon
    and        in         past       via
    apart      into       per        with


A few of these words can appear in meaningful expressions (e.g., back injury, near
infrared) so that their rejection could lead to occasional "misses."
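
The elimination step can be illustrated with a short sketch. This is not the 7090/94 routine itself; it is a minimal modern illustration in Python, and the names STOP_WORDS and strip_stop_words are hypothetical. It assumes that a profile phrase is simply a string of words separated by blanks.

```python
# Illustrative sketch only: the names below are hypothetical and are not
# taken from the NASA/IBM program documentation.

# The 44 articles and prepositions listed in the text (NASA CR-62021, p. 40).
STOP_WORDS = frozenset("""
a about above after ahead along amid among an and apart
as at back below but by down for from in into
like near of off on onto or out over past per
round since the to under until unto up upon via with
""".split())


def strip_stop_words(phrase):
    """Return the words of a phrase with the listed stop words removed."""
    return [word for word in phrase.lower().split() if word not in STOP_WORDS]


# An ordinary phrase loses only its connecting words ...
print(strip_stop_words("effects of radiation on the crew"))
# ... but a meaningful expression can lose part of its content,
# which is the source of the occasional "misses" noted in the text.
print(strip_stop_words("near infrared"))
```

The second call shows why rejection of a word such as near can cause a miss: only the word infrared survives, so a profile phrase written as near infrared can no longer be distinguished from one on infrared alone.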

The concept of truncated titles was dropped as an unnecessary complication and 
as being inconsistent with the policy of using a controlled vocabulary in writing user 
interest profiles. It is described here only to document the discrepancy between the 
description of GTDOC in NASA CR-62021 and the actual program. It will also suggest
the flexibility of the program; titles and other fields on the linear file could be entered 
as match possibilities by operators of independent or decentralized systems. 

The names and initials of personal authors at one time were matched as if they 
were subject descriptors, as mentioned in a preceding paragraph. This resulted in 
curious false matches, such as announcement of a totally nonrelevant report to a 
meteorologist because the report author's name was Snow, or an equally meaningless
announcement to a profile on machining because one of the authors had the name Hammer.
Because of lack of interest, and to save computer time during the MATCH program, 
selection on author's names was dropped from the 7090/94 NASA SDI operation. It 
could easily be reinstated if desired. If personal authors were to be reincluded as 
match options, their names would be concatenated in the manner described in NASA 
CR-62020, p. 14, as the computer program has been modified. 

Indexing practices. Effectiveness in matching documents against user profiles
ultimately depends on the subject index terms making up the document profile. The
index terms are assigned to documents for a purpose other than SDI; namely,
bibliographic retrieval. Indexing is done in conjunction with abstracting. Ten to twenty
subject terms are assigned to each report by trained literature analysts. The index
terms then are written on magnetic tapes.

Index terms are referred to as either "machine terms" or "published terms."
The latter are those that appear in the book-type index that is published in each
issue of STAR and IAA and is cumulated quarterly and annually. The other index
terms are used only in bibliographic searching by a machine (computer). Of the ten
to twenty index terms assigned to a document, only three or four will be used as published
terms. Another distinction is that published terms are frequently multiword descriptors,
while machine terms are commonly single words.

In assigning index terms, indexers are restricted to a controlled vocabulary 
which may be changed only by approval of a vocabulary edit group. This vocabulary 
consists at present of about 19,000 machine and published terms combined.

Reports and journal literature are indexed according to the same standards and 
the same vocabulary, although processed by different organizations. For both areas, 
citation tapes having identical formats are prepared and distributed. The full range of 
aerospace literature found in STAR and IAA therefore is available in machine readable 
form. 


The operator of an independent SDI system will frequently have reason to examine 
the indexing assigned to documents, particularly in reviewing the announcements received 
by a participant. A printout of all indexing terms that took part in the match of a given 
issue is provided by a 1401 print program from a tape obtained during the 7090/94 
computer run. This document profile printout is described in Section 6. The following 
comments on indexing practices will help explain the contents of the document profiles: 


A number of common adjectives, such as high and low, are always
precoordinated with the terms they modify. Proper names, in such expressions as
Einstein law, and naturally combining forms, such as Mariner II Spacecraft, also
are precoordinated. Otherwise, all words in the phrases listed should also appear as
separate words in the document profile, indicating that they are assigned as
machine terms as well as published terms. Practice varies to some extent, however.
For example, an adjectival form in a phrase might occur as a noun form in a machine
term: e.g., Magnetic effect and Magnetism. Usually, both Electric stimulus and
Electric, not Electricity, will appear in this printout.

The distinction between machine terms and published terms is not meaningful 
in the 7090/94 SDI program. All descriptors — machine, precoordinated, or published — 
are disassociated into their unit word components. In reviewing the document profile 
printout, it is important to recognize that matches can result from combinations of any 
words in a profile, whether a word appears singly or as a component of a multiword 
descriptor. 
Index terms are assigned in the singular unless the plural form is significantly
different in meaning; e.g., plastic and plastics are separate index terms.

Hyphenated descriptors are concatenated by the computer and treated as
single words. About 450 hyphenated expressions are found in the machine term
vocabulary. These include such combinations as all-weather, Baker-Nunn, B-70, etc.
The component parts of hyphenated expressions cannot match by themselves; e.g., the
word weather in a profile cannot be matched in any way to the word all-weather.

Such expressions as M- 1 (note space) are not treated as hyphenated.
Their presence is the result of allotting several digits to allow for future growth in
series designations. In writing profiles, the system operator should be careful that
such expressions in the user's profile correspond exactly to the machine term
vocabulary. An M-1 in a user profile will not match an M- 1 in a document profile.
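
The treatment of multiword and hyphenated descriptors can be sketched as follows. This is an illustration under stated assumptions, not the original program: the function names unit_words and document_word_set are hypothetical, descriptors are assumed to be split on blanks, and a hyphenated expression is kept as one token with its hyphen retained (the text says only that such expressions are concatenated and treated as single words).

```python
# Illustrative sketch only; function names and the blank-splitting rule
# are assumptions, not details of the 7090/94 program.

def unit_words(descriptor):
    """Split a descriptor into unit words; a hyphenated expression stays whole."""
    return descriptor.lower().split()


def document_word_set(descriptors):
    """Collect every unit word of every descriptor in a document profile."""
    words = set()
    for descriptor in descriptors:
        words.update(unit_words(descriptor))
    return words


profile = document_word_set(["All-weather", "Magnetic effect", "Magnetism"])

# Words of a multiword descriptor can each take part in a match ...
print("magnetic" in profile, "effect" in profile)
# ... but a component of a hyphenated expression cannot.
print("weather" in profile)
# An expression written with an embedded space does not reduce to the
# same words as the form written without it.
print(unit_words("M-1") == unit_words("M- 1"))
```

Under these assumptions the last comparison is unequal, which mirrors the caution in the text that a user profile must reproduce such series designations exactly as they appear in the machine term vocabulary.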

Depth of indexing 

The number of index terms is important because (1) the greater the number of 
terms the more likely a match can take place, and (2) the greater the number of terms, 
the more machine time will be required. The document profile printout lists the number 
of index terms for each journal issue. Table I gives the average number for STAR and 
IAA issues processed during the first half of 1965. 


                          TABLE I

          Average Number of Indexing Terms (1965)

               IAA                        STAR

         Issue    Average           Issue    Average

           1       16.9               1       14.1
           2       16.7               2       14.7
           3       16.9               3       14.5
           4       16.0               4       13.4
           5       15.8               5       13.8
           6       15.2               6       15.3
           7       15.7               7       15.3
           8       15.6               8       15.4
           9       15.6               9       15.8
          10       16.1              10       15.8
          11       15.2              11       16.1
          12       15.5              12       15.8
These figures include subject terms and contract numbers where present and
the expression Foreign-lang discussed in Section 4. Correction for these descriptors
would not reduce the averages significantly.
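
For a rough overall figure, the twelve issue averages of Table I can be averaged for each journal. The sketch below is an unweighted mean of the tabulated values only; it does not weight each issue by its number of documents, which the table does not give, and the variable names are hypothetical.

```python
# Unweighted mean of the issue averages in Table I (first half of 1965).
from statistics import mean

IAA = [16.9, 16.7, 16.9, 16.0, 15.8, 15.2, 15.7, 15.6, 15.6, 16.1, 15.2, 15.5]
STAR = [14.1, 14.7, 14.5, 13.4, 13.8, 15.3, 15.3, 15.4, 15.8, 15.8, 16.1, 15.8]

iaa_mean = round(mean(IAA), 1)    # about 15.9 terms per IAA document
star_mean = round(mean(STAR), 1)  # about 15.0 terms per STAR document

print(iaa_mean, star_mean)
```

Both figures fall within the range of ten to twenty terms per document cited for indexing practice.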

As the required machine time, particularly the match time, depends on the 
number of terms in the profiles, any increase in the number of index terms would 
increase the computer time and cost. Index terms that might be added include authors, 
corporate sources, and other machine-sortable items from the citation tape input. 

Classified input 

At present, all announcements distributed under the NASA SDI system refer to 
unclassified documents with unlimited distribution. The announcements themselves are 
also, of course, unclassified. To the operator of an expanded SDI system, security- 
classified input material would be available if he were authorized to receive CSTAR and 
corresponding tapes. These are in the same format as STAR and IAA. Prescribed
security regulations would need to be implemented. 

Distribution of announcements selected from classified sources presents a 
problem: some of the announcements themselves may be classified, requiring that 
the abstract cards be handled as classified documents. In an expanded system,
provision could be made for the use of input tapes from which all citations that were
themselves classified had been stripped.




SECTION 4. USER INTEREST PROFILES


User interest profiles must be carefully structured to take full advantage of 
NASA SDI system capabilities. This section is intended to clarify the various options 
open to the profile writer and offer some practical suggestions for constructing an 
optimum profile. 

Although no particular form is essential in submitting an interest profile, the 
prospective user might find a properly organized submission form of value. Figures 
3 and 4 illustrate a form that has been developed for this purpose. Spaces are provided 
for job and organizational information, of great value to the system operator in reviewing 
the profile for adequacy of interest representation. Grouping of spaces for the various 
types of possible profile content, together with brief notes concerning these, is intended 
to guide the user toward a well-balanced profile. 

Interest profiles received by the system operator must be keypunched according
to the card format prescribed in NASA CR-62021, p. 17, with care being taken that
descriptor usage codes are used properly; i.e., M is used only after a must single
word, the same space (Column 70) is left blank after a positive phrase or a may word,
etc. As column assignment is relatively simple, any 80-column coding form may be
used for instruction of the keypunch operator. In present practice, coding is done by
a vocabulary specialist during his detailed review of the submitted interest profile.
This assures that the profile written on the update tape will consist of descriptors
that are actually used in report indexes and that represent the user's interests.

Preparation of an interest profile takes thought. Periodic refinement and up- 
dating also are essential to assure good service. 

Briefly, an interest profile consists of terms, phrases, and negations that add up
to a representation of the individual's interests. The following matching capabilities 
are available: 

"Must" terms. Simplest type of matching between an interest profile and 
document indexes is accomplished on profile terms (single or hyphenated words only) 
to which an "M" (for "must") has been prefixed. Every document having that term in 
its index will be announced, unless barred by a "not" term or "not" phrase in the 
same profile. For example: 


     User Profile          Document Profile

        ...                   ...
        M Apollo              Apollo
        ...                   Landing
                              Lunar
                              ...
Inclusion of "must" terms can be beneficial if used with discretion. The possibility of
receiving much extraneous material is obvious if frequently assigned index terms such
as rocket or missile are listed as musts. On the other hand, the use of carefully 
selected must terms can help assure that all pertinent reports are announced. 

Generally the more specific the term, the greater its potential value as a must term. 
Index terms such as Concorde, Gemini, or Kiwi would be more appropriate for
musting than would words of more general meaning. Musting such specific terms
rather than entering phrases covering a general topic can even reduce the number of
no-interest announcements. Words of many meanings, such as echo, precipitation,
and mercury, should preferably not be musted, unless suitably restricted by "not"
terms or phrases. 

Usually, whether a given term should be musted can easily be decided on 
the basis of the participant's or profile reviewer's experience of how frequently such a 
term is used or how widely it varies in meaning. In case of doubt, the Subject Authority 
List should be consulted (see Profile writing techniques in this section). 

Phrases . The best way to reduce the flood of announcements that might result 
from musting certain words is to write the latter into phrases. A two-word phrase in 
a profile will cause a match if, and only if, the two words appear in the document index. 
A phrase containing three words requires the presence of three index words, as shown 
in the following example: 


     User Profile                    Document Profile

        ...                             ...
        Spacecraft reentry heating      Apollo
        ...                             ...
                                        Heating
                                        Landing
                                        Reentry
                                        ...
                                        Spacecraft
                                        ...
                                        Thermodynamics


The phrase matching capability gives the NASA SDI system a broad 
flexibility and provides the user with the capacity for creating a tight profile. Phrases 
both (1) assure announcement of the topic represented and (2) reject nonrelevant 
information (since no match is possible unless all the terms in the phrase are present 
in the index). Phrases, in effect, are treated as if each one were a separate profile 
by itself, although they may be overridden by "not" terms and phrases. Some features
of phrase matching are worth considering: 

A two-word phrase that gives too broad coverage may be made more
specific by conversion into one or more three-word phrases. For example, heat 
transfer is a two-word phrase. If it were in a profile without further qualification, 
the participant might receive 500 announcements a year from this phrase alone, on all 
the conceivable topics to which heat transfer may be related. Announcements could 
easily be limited more closely to those of real interest by rewriting this as convective
heat transfer , reentry heat transfer , etc. , as his interests dictate. 

A three-word phrase should not be written into a profile without
deleting or avoiding all two-word phrases containing any two of its component words.
No restriction of unwanted items would be gained in such a case: the two-word
phrase would cause a match by itself, so the three-word phrase would in effect be
disregarded.

All phrases have equal weight in leading to matches. It would be point- 
less, for example, to prefix certain phrases by an "M" in an attempt to indicate that 
these were of more significance. Another way of expressing this is to say that "All 
phrases are musts. " 

A phrase need not be meaningful as a grammatical expression, nor a 
logical one. Phrases in fact are simply "clusters" of single subject index words and 
are sometimes referred to by that expression. Phrases such as liquid rocket engine 
or manned heat shield are perfectly legitimate, and, as seen below, may be preferable 
to writing longer phrases that include frequently used index terms such as propellant 
and spacecraft . Order of words in phrases is not significant. 

Seemingly redundant phrases may be included to ensure receiving 
announcements of reports indexed to near-synonyms. 

Adding more words to form longer phrases would ordinarily restrict
more and more the announcements that would be selected. But the question arises,
would the indexer be likely to assign so many words, identical to those in a long
phrase, in indexing a report, even one containing pertinent information? The
likelihood that the indexer would use a near-synonym for one of the words in a
seven-word phrase or use only six of the words is obviously very great. In order to
increase the chance of receiving an announcement of such a pertinent document, longer
phrases are matched in this way: any combination of terms taken three words at a time
out of a four- or five-word phrase, four words out of six, or five words out of
a seven-word phrase. For example:


     User Profile                       Document Profile

        ...                                ...
        Compressible turbulent             Boundary
        boundary layer skin                ...
        friction                           Friction
        ...                                ...
                                           Layer
                                           Turbulent
                                           ...
In the example, the document will be announced even though the index terms compressible
and skin had not been assigned by the indexer, because four of the six phrase words
are present in the document index.
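
The phrase rule described above can be sketched in a few lines of Python. This is only an illustration of the stated thresholds (all words of a two- or three-word phrase, three of four or five, four of six, five of seven); the names REQUIRED_WORDS and phrase_matches are hypothetical and do not come from the 7090/94 program, and the comparison of plain lower-cased words stands in for the coded descriptors the program actually used.

```python
# Minimum number of phrase words that must appear in the document index,
# keyed by phrase length, as stated in the text. Phrases are limited to
# two to seven words; any other length raises KeyError in this sketch.
REQUIRED_WORDS = {2: 2, 3: 3, 4: 3, 5: 3, 6: 4, 7: 5}


def phrase_matches(phrase, document_terms):
    """Return True if enough words of `phrase` occur in the document index.

    Word order is ignored, as in the text. The phrase is assumed to
    contain no repeated words.
    """
    words = {w.lower() for w in phrase.split()}
    index = {t.lower() for t in document_terms}
    return len(words & index) >= REQUIRED_WORDS[len(words)]


# The six-word example: only four of the six words were indexed.
print(phrase_matches(
    "Compressible turbulent boundary layer skin friction",
    ["Boundary", "Friction", "Layer", "Turbulent"],
))
```

Because the test is a simple count, the sketch also shows why a four-word phrase can select documents the participant did not intend: any three of its words are enough.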


Care should be taken in writing longer phrases, however, as this combinational 
type of matching can occasionally lead to receipt of notices diametrically opposed to the 
subject of interest. In cases of doubt, a word or two can usually be eliminated to back up 
to a three-word phrase. For example, the four-word phrase, manned spacecraft heat shield,
would lead, because any three words are matched, to announcements of reports of heat 
shields on both manned and unmanned spacecraft. It might profitably be converted to the 
three-word phrase, manned heat shield. "Not" terms or phrases might also be used to 
reduce announcements of unwanted items. 


Percentage matching . The capability of matching by percentages of the number of 
single terms is an evolutionary survival of the earliest days of NASA SDI development, 
when the participant's interest profile consisted merely of a listing of single terms, cor- 
responding in appearance to the document profile (NASA CR-62020, p. 16). Matching was 
statistical; a certain percentage (say 17%) of the terms in the shorter of the two profiles
was selected on the basis of past experience; if at least that number of terms in the
interest profile were identical with terms in the subject index, a match
occurred. The number of false associations tended to be high if the percentage was set
low, in which event any two or three words might match. This matching capability was
later subordinated to the more flexible matching provided by phrases, must terms, etc. 

In percentage matching, all the phrases are disassociated into single words. The latter 
are then taken in combination with any single words in the unaltered profile (these are known 
as "may" words) and percentage matched against the report indexes. The percentage is 
set high, however. At present, 50% of the number of terms in the shorter profile, usually 
the subject index of a document, are required to match before an announcement is sent out. 


     User profile          Document profile
     (converted)

        Cryogenic             Cryogenic
        Gravity               Expulsion
        ...                   Fuel
        Propellant            Gravity
        Transfer              Liquid
        Zero                  Pressurization
                              Propellant
                              Transfer
                              Zero


As the match factor is set so high, a large number of words must be identical for a match 
to take place. Therefore, if a match occurs it is likely, in theory, to be of specific interest. 
A possible advantage of this form of matching is that it gives the participant a chance to 
get an announcement of a pertinent document that could not have been announced to him by 
phrases or other features of his profile. Reports on the reasons for matches (see
Section 6) indicate that only about 3% of all matches result from this option. Furthermore,
since in the match process the percentage match follows the match for must terms,
but precedes the match against phrases, many announcements attributed to a "may" 
match would have been selected by a profile phrase in any event. Probably only 1% 
of all announcements are uniquely selected by percentage matching when the minimum 
for a match is set at 50%. 
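
A minimal sketch of the percentage test, under stated assumptions, may make the rule concrete. The function name percentage_match is hypothetical, and rounding the required count up to a whole number of terms is an assumption: the text gives the 50% figure but not the rounding rule. Only the five user terms visible in the example are used; the elided terms are left out.

```python
import math


def percentage_match(phrases, may_words, document_terms, fraction=0.5):
    """Percentage ("may" word) test as described in the text.

    Phrases are broken into single words and pooled with the "may" words.
    A match requires that the number of identical terms reach `fraction`
    of the size of the shorter of the two profiles. Rounding up is an
    assumption of this sketch.
    """
    pool = {w.lower() for p in phrases for w in p.split()}
    pool |= {w.lower() for w in may_words}
    index = {t.lower() for t in document_terms}
    shorter = min(len(pool), len(index))
    if shorter == 0:
        return False
    return len(pool & index) >= math.ceil(fraction * shorter)


document = ["Cryogenic", "Expulsion", "Fuel", "Gravity", "Liquid",
            "Pressurization", "Propellant", "Transfer", "Zero"]
print(percentage_match(["zero gravity propellant transfer"],
                       ["cryogenic"], document))
```

With a very short user profile the rule is satisfied by one or two common words, which is consistent with the remark above that low thresholds led to false associations.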


The program could be modified so that a given number of terms, say 4 or 5,
can be required to match rather than a number determined for each document by a
specified percentage of terms. This type of matching was tried for several months.
Results in number and relevance of announcements were not significantly different from
those of a percentage-type match.


"Not" terms. If an independent word in an interest profile is preceded by an "N"
the computer will not send an announcement of any report whose index contains that 
word, no matter how good the match is otherwise. The "not" indication overrides all 
other instructions, even the "must. " This capability is very useful where a must term 
or short phrase best expresses a participant's broad interests, yet he does not wish to 
receive certain types of announcements. For example, the following profile suggests 
that the participant is interested in the subject of loads on spacecraft, launch vehicles, 
satellites, etc. , under all possible conditions of transportation, staging, launching, 
docking in space, etc. Rather than write phrases for all the multitude of possible 
relationships, he has musted the word "load. " However, he is not interested in loads 
on aircraft or helicopters, therefore he has indicated these as "not" terms. 



     Subject Index

        ...
        Aircraft
        ...
        Load


He would ordinarily have received an announcement of the report indexed in the 
example, but the negation before the term "aircraft" has properly barred this. 


"Not" terms must be used with care, after due consideration of the possibility 
that information of interest might be included in a report with information that the 
participant does not want. In the example, if information concerning loads on both
aircraft and spacecraft were discussed in a single report and indexed to these terms,
the participant would not receive a notice of this report, although it would indubitably
have been of interest.
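
The override behavior of "not" terms can be illustrated with a short sketch. The function announce_on_terms is a hypothetical name, and the profile is the load example from the text; the index terms in the second and third calls are invented for illustration.

```python
def announce_on_terms(must_terms, not_terms, document_terms):
    """Apply single-word "must" and "not" terms to one document index.

    A "not" term found in the index bars the announcement regardless of
    any "must" term; otherwise any "must" term found in the index
    selects the document.
    """
    index = {t.lower() for t in document_terms}
    if any(t.lower() in index for t in not_terms):
        return False
    return any(t.lower() in index for t in must_terms)


must = ["Load"]
nots = ["Aircraft", "Helicopter"]
print(announce_on_terms(must, nots, ["Aircraft", "Load"]))                # barred
print(announce_on_terms(must, nots, ["Load", "Spacecraft"]))              # announced
print(announce_on_terms(must, nots, ["Aircraft", "Load", "Spacecraft"]))  # barred
```

The third call reproduces the caution above: a report covering loads on both aircraft and spacecraft is barred along with the unwanted ones.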
"Not" Phrases. Just as the wide range of announcements selected by a "must"
term can be narrowed by converting the must term into a phrase, a "not" term can be 
made more specific in forestalling unwanted announcements by changing it to a "not" 
phrase. An "N" preceding a phrase will override all matching possibilities if all the 
words in the "not" phrase appear in the subject index. The participant in the following 
example is interested in meteorological aspects of rain, not in the erosion of aircraft 
surfaces caused by rain. He therefore "musts" the word rain, but "nots" the phrase
aircraft rain erosion. The likelihood that a report whose announcement was barred by 
this phrase would also contain significant meteorological information is clearly smaller 
than the likelihood that a report barred by simply negating the single term "aircraft" 
might contain information of interest. 


     User Profile                 Document Profile

        ...                          Aircraft
        M Meteorology                Erosion
        M Rain                       Rain
        N Aircraft rain erosion


Phrases, as well as must terms, may be modified by adding not phrases to the 
profile. It is very important to note that a not phrase should be longer than any positive 
phrase made up of its component words. 


Not phrases of 4 to 7 words operate in the exact inverse of positive phrases of 
the same length. That is, a four-word not phrase will stop an announcement if any 
three of its words appear in the document index. 
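
As a sketch of how a "not" phrase narrows the bar compared with a single "not" term, the rain example can be worked through as follows. The names REQUIRED_WORDS, phrase_hits, and announce are hypothetical; the thresholds are those given for positive phrases, which the text says apply in the same way to not phrases.

```python
# Words of a phrase that must be present in the index, by phrase length.
REQUIRED_WORDS = {2: 2, 3: 3, 4: 3, 5: 3, 6: 4, 7: 5}


def phrase_hits(phrase, index):
    """True if enough words of the phrase appear in the lower-cased index."""
    words = {w.lower() for w in phrase.split()}
    return len(words & index) >= REQUIRED_WORDS[len(words)]


def announce(must_terms, not_phrases, document_terms):
    """Select on "must" terms, then bar the document if a not phrase hits."""
    index = {t.lower() for t in document_terms}
    if not any(m.lower() in index for m in must_terms):
        return False
    return not any(phrase_hits(p, index) for p in not_phrases)


must = ["Meteorology", "Rain"]
not_phrases = ["Aircraft rain erosion"]
print(announce(must, not_phrases, ["Aircraft", "Erosion", "Rain"]))      # barred
print(announce(must, not_phrases, ["Aircraft", "Meteorology", "Rain"]))  # announced
```

In the second call the word aircraft alone does not bar the report, since only two of the three not-phrase words are present.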


Contract numbers . Incorporating contract and grant numbers into an interest 
profile can help assure that announcements in certain specific areas will be received 
regardless of failure to match through index terms. Conversely, including contract 
or grant numbers as negations can reduce the number of announcements, as by stopping 
announcements that a participant might be getting as part of his official duties. The 
more specific the contract or grant, the more specific the announcements. Contracts 
covering interdisciplinary studies should only be included with full realization that 
many documents far from a participant's interests may be announced to him. 



     Document Profile

        ...
        NGR-12-001-010
Contract numbers are special descriptors. In printouts of user profiles, they are
preceded by a "C" if a must term, a "D" if a not term. They must be entered into
the user profile in exactly the same standardized form in which they appear on the 
input tape. 

Miscellaneous match options . In response to relatively frequent requests from 
SDI participants to avoid announcements of documents in a foreign language, the 
option of "notting" such documents was added. Translation services are plentifully 
available throughout NASA; the primary purpose of such requests was reduction of total 
announcements by eliminating an area of information that might be called to users' 
attention by other means. Adding the expression Not Foreign-Lang to an interest 
profile reduces the number of IAA announcements by approximately one-third. 

Programming this option was accomplished by reading blocks 45-46 on the
input tape (see Sect. 5). These provide two digits for the language of the document:
01, English; 02, mixed languages as in some conference reports; 12 to 98, other
languages. The program adds Foreign-lang to the document profile if any digits except
01 and 02 are present in this block. This term on the document profile is thus treated
as an ordinary descriptor, equivalent to a subject term.
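
The language rule amounts to a one-line test, sketched here. The function name add_language_descriptor is hypothetical, and treating a blank field as "no digits present" is an assumption of the sketch.

```python
def add_language_descriptor(document_terms, language_code):
    """Append Foreign-lang unless the two-digit code is 01 or 02.

    `language_code` stands for the two digits read from blocks 45-46 of
    the input record. A blank field is assumed to add nothing.
    """
    terms = list(document_terms)
    code = language_code.strip()
    if code and code not in ("01", "02"):
        terms.append("Foreign-lang")
    return terms


print(add_language_descriptor(["Rain", "Erosion"], "12"))
print(add_language_descriptor(["Rain", "Erosion"], "01"))
```

Once the term is on the document profile, an ordinary "not" term in the user profile is enough to bar the announcement; no special matching logic is needed.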

Profile writing techniques. Writing a good interest profile can be approached
systematically with the aid of two publications. The first, Guide to the Subject
Indexes for Scientific and Technical Aerospace Reports, SP-701G (Rev 3)(1), lists
the most common vocabulary terms plus numerous cross references (figs. 5 and 6).
The profile writer may use this guide to assist his memory as he makes a tentative
list of single words and phrases to describe his interests. He will find the numerous 
phrases listed to be particularly helpful in creating an optimum profile. He may enter 
phrases as listed or may rewrite them into longer phrases to obtain greater specificity. 

The single words tentatively written down as subjects of interest should next be 
considered as to the advisability of (1) musting them, (2) musting with addition of not 
terms and/or not phrases, or (3) writing into phrases. The optimum choice can some- 
times be decided by the participant from his own experience, but it will often help to 
consult the Subject Authority List . This publication, which is a 500-page computer 
printout, is updated and distributed monthly to all recipients of the NASA linear file 
tape. A sample page is shown in figure 7. The columns show the number of documents 
to which the particular index term has been assigned in each year (2 = 1962, 3 = 1963, 
etc.) and journal (A = IAA, N = STAR, X = CSTAR). An estimate can thus be obtained
of the number of announcements that might be received if a particular word were musted 
without qualification. 


(1) Distributed to all recipients of Scientific and Technical Aerospace Reports
(STAR). It may also be purchased from the Clearinghouse for Federal Scientific and
Technical Information, Springfield, Va. , 22151. Price $3.75. 
Many less commonly used index terms may be discovered in the Subject Authority List.
Their use in interest profiles instead of, or in addition to, descriptors of more general 
meaning can be beneficial in assuring announcement of documents of specific interest. 

The overall number of selected announcements will not be increased significantly since 
the terms are infrequently posted. 

In general, words that are not in the Subject Authority List should not be used in 
the interest profile, either as independent words or in phrases. However, new terminology 
that might reasonably be expected to be added to the vocabulary in the future may be 
included. How these not-in-vocabulary (NIV) terms are handled is discussed in Section 
5. 


Of utmost importance is to avoid writing an initial profile of too great length, 
especially one which attempts to cover many broad subject areas. If the NASA SDI 
participant begins with a limited profile, he can understand the reasons for the
announcements he receives and recognize changes that need to be made to improve his profile.
Furthermore, he can readily add the proper combination of terms and phrases to
receive announcements in new subject areas. On the other hand, if he attempts to begin
receiving service with a profile hundreds of lines long, into which he has tossed any
words and short phrases that have caught his attention while looking through the Guide to
the Subject Indexes for STAR, he will be overwhelmed by hundreds of announcements, 
most of which will be unrelated to his real interests. Such a situation will be almost 
impossible to correct by less than the most drastic profile pruning; in effect, by starting 
over. If feasible, profiles should be written during a joint discussion between participant 
and profile reviewer. 

Profile reports . In order to assist the participants in evaluating and modifying 
their current interest profiles, the latter are printed and distributed at regular intervals. 
These and other user profile reports are discussed in Section 6. 


Profile Review . In the present NASA SDI program, it is felt essential that 
all new interest profiles and profile changes be reviewed by professional vocabulary and
indexing specialists who are also thoroughly familiar with the workings of the system. 
These profile reviewers have available such tools as the Subject Authority List. In 
addition to new profiles, the reviewer will regularly select individual profiles for detailed 
review. A profile to be examined may be that of an individual interested in obtaining 
improved service, or that of a participant whose response record has shown a low per- 
centage of hits. (Evaluation of responses is discussed in Section 9. ) 

Profile review may be undertaken by simple consideration of each line of the 
profile in light of the participant's known interests. Another powerful method of profile
improvement is consideration of the subject indexes for specific documents that were 
selected by the individual's profile but which he has indicated are of no interest. 
Comparison of the subject indexes, which are available as a computer printout following 
each run (see Section 6), with the interest profile will perhaps show that a particular must
term is actually of broader interest than presumed or that a two-word phrase inevitably
will lead to many false associations. (An example of such a phrase is Liquid gas; both
Liquid and Gas might occur together as terms in many report indexes and cause a
match, but the document itself might have nothing to do with liquefied gas.) The reviewer
can decide, by considering all the index terms, whether to write the offending term into 
a phrase, or the phrase into a longer phrase, or to modify its selection power by adding 
one or more not terms or phrases. This method is particularly useful in choosing not 
phrases, since words that might be used to bar unwanted announcements will be found 
to repeat themselves in a number of the indexes of no-interest documents. 

A definite trend during developmental operation of the program has been the 
increasing initiative assigned to vocabulary and indexing specialists in revising profile 
structures. Significant changes are, of course, brought to the user's attention. Ideally, 
a profile should combine the experiences of both the participant and the vocabulary 
specialist. 
SECTION 5. COMPUTER SYSTEM


Two major computer programs comprise the NASA SDI system. The first program, 
vocabulary control or VOCON, updates or originates the user profiles and vocabulary 
control guide (the "dictionary") and edits both user and document profiles against the 
guide. The second program, MATCH, compares document and user profiles, and 
generates announcements when the designated match criteria are met. Both programs 
require an IBM 7090/94 data processing system with at least eight tape drives.

Documentation of these programs, consisting of operating instructions, record 
formats, and detailed flow charts, is presented in NASA CR-62021. The present report 
need only comment on the relation of input (user and document profiles) to system design 
and provide operating experience (run times) not given in the program documentation. 

Vocabulary control (VOCON) . Vocabulary control refers to the necessity of 
having user profile and document index utilize the same words so that a computer match 
between them is possible. In the NASA SDI system, vocabulary control is accomplished 
by a vocabulary control guide, also called a dictionary, which consists at present of 
single-word terms, although longer expressions (phrases) up to 120 characters can be 
incorporated. Terms are added to the dictionary whenever the computer finds them in
a document index. In addition, terms may be added or deleted by the system operator.
An equate function allows the operator to designate selected vocabulary terms as
synonyms or as trouble terms.

The dictionary and its uses in vocabulary control are discussed in NASA CR-62020,
pp. 11-14, and NASA CR-62021, pp. 11-12. Figure 8 illustrates the appearance of a
printout of the dictionary for the latter format. Left to right, the columns are: (1) an
alphanumeric descriptor (only single words are being added currently to the dictionary),
(2) its coded value as a primary descriptor, (3) the coded value of its corresponding
secondary descriptor (if this were zero, the descriptor would be a trouble term; if not 
zero, but different from the preceding column, it would be a synonym), (4) date descriptor 
was first added to the dictionary, either the date it first appeared in the document profiles 
or the date it was first added by the operator, (5) date descriptor first appeared in a user 
profile, (6) number of times that descriptor has appeared in document profiles, and (7) 
number of times that descriptor has appeared in user interest profiles. As may be seen, 
few authors appear in user profiles, few phrases in document profiles. Personal authors 
and multiword descriptors were not added to the dictionary during the later stages of the 
system operation. 
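
The reading of columns (2) and (3) described above can be sketched as a small classification function. The name classify_descriptor and the label "ordinary" for a descriptor whose two codes agree are assumptions of this sketch; the text names only the trouble-term and synonym cases. The code values in the example are invented.

```python
def classify_descriptor(primary_code, secondary_code):
    """Interpret the primary and secondary code columns of a dictionary entry.

    A secondary code of zero marks a trouble term; a nonzero secondary
    code different from the primary code marks a synonym. The remaining
    case (both codes equal) is labeled "ordinary" here for convenience.
    """
    if secondary_code == 0:
        return "trouble term"
    if secondary_code != primary_code:
        return "synonym"
    return "ordinary"


print(classify_descriptor(1042, 1042))
print(classify_descriptor(1042, 977))
print(classify_descriptor(1042, 0))
```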

This dictionary printout was accomplished by a 7090-1401 utility program which 
is available as part of the general NASA SDI computer program. Vocabulary control 
by use of the dictionary was not utilized in the operating system except for rejecting 
words not in the dictionary. This consisted usually of blocking the addition of misspelled 
or improperly constructed words. For example, if an attempt was made to add a plural 
descriptor (e.g., Rockets) to a user profile, whereas only the singular (Rocket) had
been approved for indexing use, the vocabulary would reject the term. A change in the
philosophy of user-operator relations in control of interest profile content was responsible
for not making greater use of this exceptionally powerful tool. Initially, addition of an 
interest profile to the SDI system was intended to be accomplished with little or no 
editorial review; it was to be essentially a clerical operation. A computer equate 
function would, therefore, have been necessary to convert the user's words to those of 
the approved vocabulary, plurals to singular, adjectives to noun form, misspellings to 
correct spelling, etc., as described in a hypothetical example in NASA CR-62020, p. 12. 

Although this automatic system was tested and found to work satisfactorily, it 
was never put to practical use as a part of the overall SDI operations. Instead, skilled 
vocabulary control personnel were very early assigned to review and editing of incoming 
user profiles as well as to user-indicated changes. All terms and phrases in the user 
profile were checked manually against the Subject Authority List before being entered 
on the keypunch instruction sheet. 

User profile rejection. A certain minimum percentage of the descriptors in a 
user profile must be accepted, that is, be identical with a term in the vocabulary, or the 
profile will be rejected. The minimum percentage is entered by the operator on control 
card MINU (NASA CR-62021, pp. 7, 15). During most of the operating period, the
minimum was set at 50 percent. User numbers of rejected profiles appear on the print- 
out of user transactions, while the total number of rejected profiles is tabulated in the 
program activity summary (see Section 6). 
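
The rejection test can be sketched as a simple ratio check. The function name profile_accepted is hypothetical; treating a profile that exactly meets the minimum as accepted is an assumption, since the text says only that a minimum percentage must be accepted. The parameter minimum_percent stands in for the value entered on control card MINU.

```python
def profile_accepted(descriptors, vocabulary, minimum_percent=50):
    """Accept a user profile if enough of its descriptors are in the vocabulary.

    An empty profile is rejected. Integer arithmetic avoids rounding
    questions at the boundary.
    """
    if not descriptors:
        return False
    vocab = {v.lower() for v in vocabulary}
    accepted = sum(1 for d in descriptors if d.lower() in vocab)
    return 100 * accepted >= minimum_percent * len(descriptors)


vocabulary = ["Rocket", "Missile", "Load"]
print(profile_accepted(["Rocket", "Rockets"], vocabulary))
print(profile_accepted(["Rockets", "Missiles", "Load"], vocabulary))
```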

As with certain other features of the program, profile rejection by the computer 
in this manner presupposes the entry of unrefined user profiles. If profiles are entered 
only after thorough review by vocabulary specialists, rejection does not occur. 

NIV (Not in Vocabulary) profile terms. NASA SDI participants are encouraged 
to use currently developing technology in writing or changing their interest profiles. 

New project or equipment names, in particular, should be entered as soon as the 
interested participant becomes aware of their existence, even before reports or articles 
are received by the journal abstractor-indexers. Presence of the new terms, which will 
likely be approved for indexing use, helps assure notification of these new documents. 

Handling of NIV terms during processing of new interest profiles or changes to 
existing profiles presents a problem under a policy of restricting the size of the SDI 
dictionary. NIV terms are therefore handled in the following manner: When the profile 
reviewer checks a term on a new profile or in a user’s instruction to modify an existing 
profile and recognizes that it is not in the current Subject Authority List, he codes the 
term or its associated phrase on a keypunch instruction form just as if it were to be 
added to the profile, but across the form he will write "Pink Card. " When the resulting 
card on pink stock is returned from keypunching and interpreting, the reviewer will 
file it manually by the NIV term in a small tickler file. If the NIV term is in a phrase,
cards for both the individual term and the phrase are keypunched and filed. When the
profile reviewer then receives a listing of newly approved vocabulary terms (see Section 
3), he checks each approved term against the NIV file. If pink cards related to the term 
are found, they are pulled, the participant's name is observed, and his current profile 
is reviewed to assure that changing interests have not deleted the NIV term or phrase. 

If it has not been deleted, the cards are entered into the next user transaction run. Timing of term
approval and distribution of authority lists is such that a word approved for indexing 
a new document in process can be added to a user's profile in time to take part in 
matching against the same document. 

Users should be cautioned that unapproved terms or phrases will not be printed 
out on the periodically distributed profiles. Until entered from the above described NIV 
suspense file, these terms and phrases do not occur in the historical profile tapes from 
which the profiles are printed. 

Match program. The document profile tapes and user profile tapes that have 
been coded by VOCON are compared by the MATCH program, which is described 
very briefly in NASA CR-62020, p. 18, and in detail in NASA CR-62021, pp. 182-202.

The format of the coded profiles is given in NASA CR-62021, pp. 12-13. 

The MATCH program is a straightforward procedure. Each document profile is 
compared in turn with each user profile in the following manner: The user profile is 
searched for the first descriptor in the document profile, then for the second, third, 
and so on to the end of the document profile. The same routine is then followed, using 
these same descriptors, for the next user profile, while a second document profile is 
being matched against the first user profile, and so on. In the present NASA SDI 
operation the MATCH cycle is repeated for over 700,000 user-document profile pairs
during each computer run. 

For each user-document profile pair, the computer asks the following
questions in turn: 

1. Are any words, including any in phrases, the same? If not, the program 
goes to the next profile. 

2. Is at least one of any identical words a not descriptor? If so, the program 
goes to the next profile. 

3. Is one of the identical words a must ? If so, the program checks for 
not phrases. 

4. Are there enough identical words for a percentage match? If so, the 
program checks for not phrases. 

5. Do any phrases match; i.e., all words in a 2- or 3-word phrase, at least
3 in a 4-word phrase, 3 in 5, 4 in 6, or 5 in 7? If so, the program
checks for not phrases.

6. Are any possible matches by must terms, percentage, or phrases barred 
by not phrases? (All words in a 2- or 3-word not phrase must be in the 
document profile to bar an announcement, but only any 3 words out of a 4- 
word phrase, etc. , just as with positive phrases. ) If so, the program 
goes to the next profile. If not, an announcement is written on the out- 
put notice tape. 
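
The six questions can be strung together in a short sketch. It is an illustration under stated assumptions, not the 7090/94 code: the class UserProfile and the functions phrase_hits and match are hypothetical names, all words are assumed to be already lower-cased (standing in for the coded descriptors written by VOCON), the percentage pool is taken to be the words of positive phrases plus "may" words, and the required count is rounded up.

```python
import math
from dataclasses import dataclass, field

# Words of a phrase that must be present in the index, by phrase length.
REQUIRED_WORDS = {2: 2, 3: 3, 4: 3, 5: 3, 6: 4, 7: 5}


@dataclass
class UserProfile:
    must_terms: set = field(default_factory=set)
    not_terms: set = field(default_factory=set)
    may_words: set = field(default_factory=set)
    phrases: list = field(default_factory=list)      # each phrase is a set of words
    not_phrases: list = field(default_factory=list)  # each not phrase is a set of words


def phrase_hits(phrase, index):
    return len(phrase & index) >= REQUIRED_WORDS[len(phrase)]


def match(profile, index, fraction=0.5):
    """Return "must", "may", or "phrase" if an announcement results, else None."""
    pool = set(profile.may_words).union(*profile.phrases)
    every_word = (pool | profile.must_terms | profile.not_terms).union(*profile.not_phrases)
    if not every_word & index:          # 1. no words in common
        return None
    if profile.not_terms & index:       # 2. a common word is a not descriptor
        return None
    reason = None
    if profile.must_terms & index:      # 3. must term
        reason = "must"
    else:
        shorter = min(len(pool), len(index))
        if shorter and len(pool & index) >= math.ceil(fraction * shorter):
            reason = "may"              # 4. percentage match
        elif any(phrase_hits(p, index) for p in profile.phrases):
            reason = "phrase"           # 5. phrase match
    if reason is None:
        return None
    if any(phrase_hits(p, index) for p in profile.not_phrases):
        return None                     # 6. barred by a not phrase
    return reason


rain = UserProfile(must_terms={"rain", "meteorology"},
                   not_phrases=[{"aircraft", "rain", "erosion"}])
print(match(rain, {"aircraft", "erosion", "rain"}))
print(match(rain, {"rain", "cloud"}))
```

Because the percentage test runs before the phrase test, a document selected by a phrase is often reported as a "may" match in this sketch as well, which is the effect the text describes for the match statistics.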

Output of the 7090/94 MATCH program consists of a document notice tape having the 
following storage layout in unsorted card image records: 

     Location     Content

     1            Microfiche availability (#)
     2            User's first initial
     3            Document type (N for STAR, A for IAA)
     4            User's second initial
     5            Notice type
                     Blank = may
                     M = must
                     P = phrase
                     9 = random
     6 - 15       User surname
     16 - 21      User profile number
     23 - 32      User address (first ten characters)
     33 - 38      Document number (accession number minus N65- or A65-)
     40 - 50      Next eleven characters of user address
     77 - 80      Next four characters of user address
     81 - 86      Issue number of abstract journal (formerly date)
     88 - 90      Notice code (reason for match)


Several changes from the format presented for the IBM operation of the program, 
as given in NASA CR-62021, pp. 185-186, may be noted. These changes are discussed 
in subsequent sections of this report. 
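
For a reader who wishes to process notice records in the layout listed above, a fixed-column parser might be sketched as follows. The names FIELDS, NOTICE_TYPES, and parse_notice are hypothetical, the record is assumed to be available as a character string at least 90 positions long, and the sample values are invented.

```python
# (first, last) locations, 1-based and inclusive, from the layout above.
FIELDS = {
    "microfiche": (1, 1),
    "first_initial": (2, 2),
    "document_type": (3, 3),
    "second_initial": (4, 4),
    "notice_type": (5, 5),
    "surname": (6, 15),
    "profile_number": (16, 21),
    "address_1": (23, 32),
    "document_number": (33, 38),
    "address_2": (40, 50),
    "address_3": (77, 80),
    "issue_number": (81, 86),
    "notice_code": (88, 90),
}
NOTICE_TYPES = {" ": "may", "M": "must", "P": "phrase", "9": "random"}


def parse_notice(record):
    """Split one notice record into named fields."""
    raw = {name: record[first - 1:last] for name, (first, last) in FIELDS.items()}
    out = {name: value.strip() for name, value in raw.items()}
    # The notice type is decoded before stripping, since blank means "may".
    out["notice_type"] = NOTICE_TYPES.get(raw["notice_type"], raw["notice_type"])
    # The user address is stored in three separate pieces.
    out["address"] = (raw["address_1"] + raw["address_2"] + raw["address_3"]).strip()
    return out


sample = "#JNQMSMITH     000123 CODE 1234 012345 LANGLEY RES".ljust(76) + "CTR 000007 001"
print(parse_notice(sample)["address"])
```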

Computer run times. Estimates of the time required to complete the VOCON 
and MATCH programs for various numbers of users and documents processed are given 
in Table II. 
TABLE II 

ESTIMATED 7090/94 COMPUTER TIMES (HOURS) 

Documents    Users    VOCON*    MATCH    Total 

1000          500      0.8       1.0      1.8 
              700      0.9       1.4      2.3 
             1000      1.0       2.0      3.0 

2000          500      1.5       2.0      3.5 
              700      1.6       2.8      4.4 
             1000      1.7       4.0      5.7 

* Add 15 seconds for each new user profile being entered 
into the system. 


This table is based on experience with the NASA SDI system utilizing an IBM 7094 II 
computer and processing 900 to 1500 documents per run for service to 700 to 800 users. 
Processing times were normalized to 700 users and 1000 documents. Other figures 
in the table were estimated on the assumptions that processing of document input 
profiles is proportional to the number of documents, that the rate of updating user 
profiles is relatively constant, and that required match time is proportional to the 
number of user-document profile pairs. Actual run times will, of course, vary with 
the activity in making user profile changes, number of index terms assigned to the 
average document, average length of user profiles, size of the dictionary, model of 
computer used, and other factors. Experience has shown that considerable machine 
time can be wasted if deficient housekeeping practices have resulted in poor quality 
tapes leading to numerous read and write redundancies. 
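
The MATCH column of Table II follows directly from the stated proportionality to user-document pairs: 1.4 hours for 700 users and 1000 documents is about 0.000002 hour (roughly 7 milliseconds) per pair. The sketch below reproduces that column; the function name is hypothetical, and the constant is derived from the table rather than taken from the program. VOCON times are not modeled because the table gives them only to rounded tenths of an hour.

```python
# Hours of MATCH time per user-document pair, from the normalized
# Table II entry (700 users, 1000 documents, 1.4 hours).
MATCH_HOURS_PER_PAIR = 1.4 / (700 * 1000)


def estimate_match_hours(documents: int, users: int) -> float:
    """Estimate MATCH run time, assuming time is proportional to profile pairs."""
    return MATCH_HOURS_PER_PAIR * documents * users


for documents in (1000, 2000):
    for users in (500, 700, 1000):
        hours = estimate_match_hours(documents, users)
        print(f"{documents:5d} documents, {users:5d} users: {hours:.1f} h")
```

As the text cautions, such figures are only a planning aid; actual times depend on profile lengths, dictionary size, computer model, and tape quality.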

Random notifications. Randomly selected notifications (see NASA CR-62020, 
pp. 24-25) were intended to aid in profile modification. A user who indicated that a 
random notice was of interest was sent a tab card form notifying him that he had done 
so, listing the index descriptors assigned to the accepted random notice, and asking him 
if he wished to circle certain terms to be added to his profile (NASA CR-62020, p, 2.-25). 
The capability for issuing random notices was not utilized during the later phases of 
operation, the subroutines (NASA CR-62021, pp. 183-184) having been bypassed. In 
a decentralized operation, the capability could be reinstated. 

Random notice generation was dropped for several reasons: 

1. A fixed number of random announcements (actually a fixed maximum as 
specified by control card NRAND minus any that would have been selected 
or barred by the user's profile) are sent to every participant. This 
number bears no relation to the total number of announcements the 
individual receives. For a participant who receives few notifications, 
the random ones may be more numerous than those selected according 
to his expressed interests. 

2. If the number of random announcements sent to every participant is set 
low, 2 or 3 per 1000 documents processed, the number of chances for 
thus modifying the profile is relatively small in the course of a year. 

3. Relating the revealed index terms to the best formulation of a phrase, 
negation, or other option in the interest profile is difficult. The user 
in most cases merely circles a single word, which if entered into his 
profile as an additional "may" term would have little or no result. 
Optimally, some phrase containing the circled word should be entered, 
perhaps with other qualifiers such as not phrases. This would be difficult 
for a reviewer to do solely on the basis of a returned card. 

Reestablishment of the random notification concept might be considered with 
changes in some of its concepts. Each random notice might be labeled as such, perhaps 
leading to more careful consideration by the user as to its actual relevance to his 
interests. "Of interest" responses to random notices might be sorted by user and 
considered collectively, perhaps being returned to the user with a copy of his profile 
and suggestions for changes. 
SECTION 6 


OPERATIONS REPORTS 


The SDI operating staff can be provided with the following reports from each 
computer run of the update and match programs. 

1. Frequency list. Overall number of copies of each abstract required. 

2. User accession list. Printout by participant of each abstract selected 
for him with the reason for its selection. 

3. Input card error list. 

4. Dictionary changes. 

5. User transactions. Printout of all changes to the system, including 
new and deleted participants and their profiles, descriptors added to 
each profile or reasons for their rejection, deleted descriptors, etc. 

6. Document profiles. Printout of all index terms assigned to the documents 
taking part in the current run. 

For review purposes, the following reports are called for from time to time. 

1. User profiles. Printout of individual's active or historical interest 
profiles. 

2. Vocabulary control (dictionary) printout. 

Reports in the first group are generated by IBM 1410 and 1401 programs which 
are neither documented nor described in the IBM final report. (An IBM 1401 could be used 
exclusively at some loss of speed.) The preparation of these reports and their format 
are presented here for completeness. 

All these routine operating reports begin with the necessary sorts of the 
unsorted document notice output tape (NASA CR-62021, p. 185). This tape is first 
sorted by the document accession number (e.g., N65-12345, N65-12346, ...) and, as a 
minor sort, by the participating center (e.g., Langley, Lewis, ...). This is 
accomplished on an IBM 1410 by program SDI-01, a standard 1410 operating system sort. 
(Format and run times of this and other SDI auxiliary programs are given in 
Appendix A.) 

Frequency list. A 1401 program (SDI-10), using the output of SDI-01 as input, lists on 
a 1403 printer a double columnar arrangement indicating (1) the total number of notices 
generated for each document and (2) the total number of abstract cards to be produced 
for each. Column 2 is Column 1 plus a fixed number of abstract cards 
to be produced for purposes other than SDI. 

A sample page of a frequency list is illustrated in Fig. 9. The SDI-10 program 
totals Column 1 to give the sum for each journal issue. In the total, the following code 
symbols are used for the 1000's and 10,000's positions; e.g., V678 = 15,678: 

Symbol    Value        Symbol    Value 

#         10           Y         18 
/         11           Z         19 
S         12           -         20 
T         13           J         21 
U         14           K         22 
V         15           L         23 
W         16           M         24 
X         17           N         25 

User accession list. A listing by participant of the accession numbers of each 
selected announcement, together with the reason for its selection, has been provided 
to the NASA SDI representative at each Center. A typical page of a user accession list 
is shown in Fig. 10. Symbols following each accession number (MUST; PHR 3,4; 
MAY .53, etc.) express the reason for its selection, as by a must term, three words 
out of a four-word phrase, a percentage match in which 53% of the words in the 
shorter of the user or document profile were identical, etc. Although the specific 
must term or phrase causing the match is not indicated, this can usually be observed 
quickly on comparing the user profile with the profile of the document in question. The 
user accession list is thus invaluable for improving user profiles. Reasons for 
selection of no-interest items can be determined and corrected. 

Certain features of the user accession list are noteworthy. For items selected 
by longer phrases, the actual number of matching words is given, as four for a four- 
word phrase, although fewer words, three in this case, would suffice for a match. 

The actual percentage of terms that matched for a may selection is shown, but this 
will never be less than a certain minimum, currently 50%. The minimum percentage 
for a may match is printed at the top of each page. It is important to note that alternate 
match possibilities are not given; that is, since the order of match is (1) must terms, 
(2) percentage match, (3) phrases, a must or percentage match might also have 
matched by a phrase. This possibility, which is important in profile modification, can 
usually be observed readily when comparing user and document profiles. 
In order to obtain the user accession listing, it is necessary to make a 
separate sort of the document notice output tape. This is done by program SDI 03. 
This 1410 program sorts by (1) Center, (2) user's last name, and (3) document number. 
Output is used by a 1401 print program, SDI 13, to list the user accessions. These 
programs are used only because separate reports for the various locations served by 
the NASA SDI program are sent to the librarians at each location. For a system that 
would be operated only at one location, two similar programs, SDI 02 and SDI 12, which 
sort and print out only by user and document number, are available. 
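
The two sort orders used in this section (document then Center for the frequency list and response cards, and Center, user, document for the user accession list) amount to multi-key sorts of the notice records. A minimal sketch follows; the dictionary keys and function names are hypothetical, and in-memory lists stand in for the tape sorts performed on the 1410.

```python
from operator import itemgetter


def sort_by_document(notices):
    """Major sort on accession number, minor sort on Center (as in SDI-01)."""
    return sorted(notices, key=itemgetter("accession", "center"))


def sort_by_user(notices):
    """Sort by Center, then surname, then accession number (as in SDI 03)."""
    return sorted(notices, key=itemgetter("center", "surname", "accession"))


notices = [
    {"accession": "N65-12346", "center": "Lewis", "surname": "BAKER"},
    {"accession": "N65-12345", "center": "Lewis", "surname": "ABLE"},
    {"accession": "N65-12345", "center": "Langley", "surname": "CLARK"},
]

for notice in sort_by_user(notices):
    print(notice["center"], notice["surname"], notice["accession"])
```

For a single-location system, dropping "center" from the second key gives the user and document ordering described for SDI 02.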

Error list . Input card errors, such as an incorrect header card, are entered 
on the system output tape during Phase 2 of VOCON and are printed out at the same 
time as user transactions and other reports. 

The system operator minimizes input errors by having an EAM printout made 
immediately after the input cards have been keypunched. This listing is reviewed to 
make sure that all input cards are properly formatted. It may also be compared with 
the keypunch coding sheets. 

Dictionary change list . Document profiles and user profile changes are 
compared with the dictionary during Phase 4 of VOCON. If any document index terms 
are not already on the dictionary, they are added automatically. The operator is 
apprised of this by a printout from the system output tape on which the new descriptor 
and its coded value appear with the notation Descriptor added to dictionary . Information 
concerning descriptors added by the operator also appears on this printout; e.g. , 
Descriptor already on dictionary. Cannot be added . 

User transaction list. User profile transaction tapes, which are sorted by user 
profile number, carry all descriptors in alphanumeric as well as coded representation 
(see NASA CR-62021, pp. 89, 136). Printouts of all user profile changes are supplied 
for the operator's review during each production run. These printouts list added, 
rejected, and deleted descriptors, added user profiles with all terms, and deleted user 
profiles. This information is placed on the system output tape during Phase 8 
of VOCON, and the user profile transaction printout is produced from that tape routinely 
using a standard 1401 print program. 

Figure 11 is a sample page from the user profile transaction printout. It shows, 
in order, six descriptors added to an existing profile, a deletion of one phrase and 
addition of another to the same profile, a failure to delete a term because it was mis- 
spelled on the delete card, a rejection of a user profile, a change of address, addition 
of a single word, another address change, two additions of the Not foreign language 
capability, etc. Note that single terms bear no indication as to whether they are 
must, not, or may terms; this is shown, however, on any active user reports which 
have been requested. In the latter, the coded descriptor is included with the alpha- 
numeric, and the first octal number in the coded description provides this identification 
(see NASA CR-62021, p. 13). 
At the end of the user transaction printout, all changes are summarized, a 
typical summary being shown in Fig. 12. 

The profile reviewer examines carefully each user transaction printout. A 
rejected profile may mean that no header card has been prepared. A rejected term 
may have been incorrectly coded to a nonexistent or previously deleted user profile. 
Consideration must also be given to the addition of rejected terms to the dictionary. 

Document profiles. The profile reviewer will frequently have reason to 
examine the index terms (document profiles) that have taken part in the match program. 
Consideration of the document profiles in conjunction with the announcements selected 
by a user interest profile (or not selected, if "misses" are being considered) 
is unquestionably the best way to improve the profile. In order to provide the reviewers 
with a copy of the document profile in convenient form, the system is designed so that 
the stripped linear file ready to go to the machine coding process is printed out in the 
format shown in Fig. 13. This printout may contain more than merely subject index 
terms. Contract numbers and the expression Foreign-lang (see Section 4) are added 
from the appropriate document codes on the linear file record. If other linear file 
elements (see Section 3) were to be picked up, they would also appear on this printout. 
Additional information concerning each document is shown in the header line that begins 
and ends each document profile. This is an alphanumeric dump of the reformatted 
linear file input. The profile descriptors include published index terms as well as 
machine terms, as is evident from the presence of multiword descriptors. These are 
not distinguished by any marking, however, since in the 7090/94 program all 
multiword descriptors are disassociated into their component parts before being entered into 
the coded document profiles. 

A rejected profile would have the word Rejected printed where Accepted 
otherwise appears. Its index terms would not be printed out. Rejection of a document 
profile or a document index term is rare, but may be caused by an error in the 
document header or by a misspelling. New descriptors will appear with the annotation 
Descriptor added to dictionary in the printout of dictionary changes. 

On completion of the document profile printout, the number of accepted and 
rejected profiles and descriptors is summarized, as in Fig. 14. 

User profile reports. In order to obtain an up-to-date copy of a particular 
user's profile, the profile reviewer requests either an "active" report or a "complete" 
report by entering input cards punched as prescribed in NASA CR-62021, p. 18. These 
reports, which are printed out in the user transactions during the next computer run, 
are known colloquially as "R-1" and "R-2" reports because these characters are 
keypunched in card columns 71 and 72. A page from an R-1, an active user profile report, 
which consists of current descriptors with their coded values and dates of entry, is 
illustrated in Fig. 15. A page from an R-2, a complete historical profile report, which 
includes all current and deleted descriptors, with dates of deletion, is shown in Fig. 16. 




Note that the date of entry or deletion may be either a calendar date or, as the result of a 
recent change, the issue number of the journal being processed. 

The first column of the coded descriptor gives information concerning the type 
of match in which it takes part (see NASA CR-62021, p. 13). A 0 indicates a may term; 
a 1, a must ; a 7, a not. 

Periodically, all current user profiles are printed out in a compact, double 
column format for distribution to the participants. On this printout, which is illustrated 
in Fig. 17, must and not terms are preceded by an M or N. The print program, which 
for 700 profiles requires about five minutes of 7094 time for formatting and 30 minutes 
of 1401 time for printing, is available as part of the general NASA SDI package. 

Vocabulary (dictionary) printout . This report is discussed in Section 5. 

A sample page is shown in Fig. 8. 

Master user record. For production of address cards, generation of user 
listings, etc., a master file of tab cards is maintained which has the following format: 

Columns    Content 

1-6        Profile number 
8-10       Location (Center) 
12-42      Surname, initials, title, etc. 
43-80      Address 


To distinguish these cards from others entering the processing cycle, such as 
user-header and change cards, they are color coded. The profile editor simply writes 
"Green stripe" across the coding sheet on which he enters a master record. The key- 
puncher then selects the correct card stock. 
SECTION 7 


ANNOUNCEMENTS 


A sample announcement such as participants in the 7090/94 NASA SDI system 
received is pictured in Fig. 2. It consists of an abstract card and a response card, 
delivered in a sealed window envelope. Details of the abstract and response cards are 
given in Figs. 18, 19, and 20. Significant changes will be noted in comparison with the 
earlier announcements shown in NASA CR-62020, p. 21. 

Response card. The important response card, also referred to as the 
notification-evaluation card, has been modified by (1) changing the date, which was 
originally the date on which the announcement was mailed, to a more meaningful symbol 
for the journal in which the abstract appears (N for STAR, A for IAA) and the issue 
number (01 to 24), and (2) adding a "#" symbol to indicate the availability of the 
document in microfiche (cf. Fig. 18). 

The first of these changes assists the user in comparing the announcements 
he received with the corresponding issue of the journal so as to evaluate the service 
being received. The second change aids his librarian in recognizing that a request 
for a document might be filled with the convenient microfiche. 

The format for the response card is pictured in Fig. 19. For an independent 
operation, the card could be redesigned. 

Data to prepare the response cards are available on the 7090/94 system output 
tape. From this the response cards are punched with the user's code number, name 
and address, accession number of the selected document, issue from which it was 
taken, and a "#" symbol to indicate that microfiche copies of the document were 
available. Punching is done by program SDI 11 (See Appendix A) using a 1401 computer 
and input from SDI 01. The punched cards are in order by accession number with a 
minor sort by Center. After the response cards are punched they are interpreted on 
an IBM 557 Alphabetic Interpreter. Although a wiring diagram of the IBM 557 to be 
used for interpreting blue cards is given in NASA CR-62021, p. 187, it is not 
completely up to date, as the date was later changed to journal and issue number, and the 
"#" symbol was added. 

The following are updated plug board entries for the NASA SDI response card 
interpretation: 
Entry 1, Print line 16 

Card Column    Print Position    Field 

2              39                First Initial 
4              40                Second Initial 
6-15           42-51             Last Name 
26-28          56-58             Center Code 

Entry 2, Print line 17 

Card Column    Print Position    Field 

16-21          17-22             User Profile Number 
3              24                Document Type 
72                               Issue Number 
74             26                Issue Number 
34-38          28-32             Accession Number 
1              34                Microfiche Code 
23-25          39-41             Internal Address 
29-32                            Internal Address 
40-50          46-56             Internal Address 
77-80          57-60             Internal Address 

Port-A-Punch Positions: 

Of Interest, Document Requested        occurs as 12 zone punch in col. 71 
Of Interest, Document Not Requested    occurs as "x" or 11 zone punch in col. 73 
Of Interest, Have Seen Before          occurs as zero punch in col. 75 
Of No Interest                         occurs as 2 punch in col. 73 
Comments                               occurs as 3 punch in col. 75 


Several card columns (5 and 33) are not interpreted. The information punched in these 
columns is used in obtaining printouts of user notices and the reasons for their 
selection (See Section 6). 
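
When returned cards are read by machine, the Port-A-Punch positions listed above reduce to a lookup on (column, punch row). The sketch below illustrates that lookup; it assumes the card reader output has already been reduced to a list of (column, row) pairs, with rows written as the strings "12", "11", and "0" through "9". The names are hypothetical.

```python
# (card column, punch row) -> response, from the Port-A-Punch positions.
RESPONSES = {
    (71, "12"): "Of interest, document requested",
    (73, "11"): "Of interest, document not requested",
    (75, "0"): "Of interest, have seen before",
    (73, "2"): "Of no interest",
    (75, "3"): "Comments",
}


def decode_response(punches):
    """Return the responses indicated by an iterable of (column, row) punches."""
    punched = set(punches)
    return [label for position, label in RESPONSES.items() if position in punched]


print(decode_response([(71, "12")]))
```

A card may carry more than one punch (for example, a response plus "Comments"), so the function returns a list rather than a single value.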

Abstract card. The abstract card, a sample of which is illustrated in Fig. 20, 
is obtained in the following manner: When a report is accessioned by the NASA 
Facility, it is cataloged and abstracted. An accession number is assigned for future 
use in ordering copies or reproductions. Abstract and citation are typed with 
simultaneous generation of a paper tape. The latter is used to drive a photographic 
composition machine, which produces a justified galley on undeveloped photographic 
paper. The exposed paper is developed photographically through standard darkroom 
techniques. After proofreading corrections are made, the resulting galley, which is 
in 12 point Univers Bold type, is the raw material from which both the offset printed 
STAR and the NASA SDI abstract cards are produced. 

Composition of IAA is an independent operation by the American Institute of 
Aeronautics and Astronautics, New York City. Abstracts are typewritten, not 
photocomposed, and are not justified. A high standard of legibility is obtained in the 
final output, however. The final galleys are forwarded to the NASA Facility for 
production of abstract cards for those announcements selected from IAA . 

When the galleys for STAR and IAA are received, they are reduced photo- 
graphically. It can be noted that the NASA SDI abstracts are in smaller type than in 
their corresponding appearance in STAR and IAA . All abstract cards are reproduced 
at the same reduction from the original copy. The original 12-point output is reduced 
to 70 per cent for the abstract journals, to 55 per cent for the abstract cards. 

This reduction ratio was selected to minimize the number of cards on which 
the abstract must occupy two columns. A statistical analysis of STAR , Vol. 3, 
Issues 1 to 3, showed that only 2.2 per cent of all abstracts required a second 
column. IAA abstracts were somewhat longer, 6.5 per cent running into a second 
column, giving a combined average of 3.8 per cent. 

The photographs of the abstracts are cut up and arranged by accession number 
for reproduction. The actual number printed is determined by a frequency list 
obtained as part of the computer matching (Fig. 9). This gives the total number of 
each announcement required by all participants, perhaps only 4 or 5 of a special 
interest item, 40 to 50 of a document of broader content. 

After multilith reproduction of the desired number of announcements, the 
resulting six-announcement sheets are cut carefully to 7 3/8 inches long by 3 inches 
wide. 


Distribution of abstract cards having a high graphic arts quality is a unique 
feature of the NASA SDI system, and one of the most popular. A questionnaire 
(NASA CR-62020) showed that 90% of all recipients were filing some of the cards 
that they received. The fact that the cards are reproduced from the journal abstracts, 
rather than printed by computer, means that special symbols and mathematical 
formulas can be presented, and of course makes for excellent readability. 

Microfiche of abstracts . An independent operator of this SDI system who 
wishes to supply abstracts to his participants may reproduce them from microfiche 
of abstracts. Microfiche of all abstracts in STAR and IAA issues are available to 
NASA Centers and contractors on request. 

Envelope insertion . After the required number of abstract cards have been 
reproduced, the blocks of identical cards are separated by any convenient dividers, 
such as two or three colored cards stapled together. A number of blocks of cards 
are then inserted in one of the hoppers of a standard business envelope inserter. The 
blue response cards, after they have been interpreted, are handled in the same way. 


A block of response cards, each of which is interpreted with the same 
accession number but, of course, with different user names, is separated from the 
succeeding block by a manually inserted divider, as with the abstract cards. The 
envelope inserter is then started, and operation continues until stopped by the divider 
between the blocks of blue response cards or by the action of the operator. The 
operator then discards the remainder of the corresponding white abstract cards in the 
other hopper, and again starts the machine, repeating these steps until all blocks of 
cards have been inserted. This procedure assures that the proper abstract card 
accompanies each response card. 

Careful adjustment of the envelope inserter is essential to avoid inserting three 
cards in one envelope or inserting only one. Another important check on system out- 
put is an inspection of the abstract cards by a quick ruffle of each block of abstract cards 
before stacking them in the envelope inserter. This assures that all cards have been 
printed properly. 

Envelopes . Any window envelopes of suitable size for abstract card insertion 
and in which the name and address on the response card registers with the window are 
satisfactory for NASA SDI use. It is important that the adhesive area or areas on the 
flaps be of optimum dimensions for ease of opening the sealed envelopes by the 
recipient. 
SECTION 8 


HANDLING DOCUMENT REQUESTS 


Requests by NASA SDI participants for the full text of announced documents 
have been brought to the awareness of library personnel by their receipt of the blue 
response cards after these had been punched by the user and dropped in his outbox. 
Library practices in handling these returned cards vary among the different NASA 
Centers. At smaller Centers, the requested cards are scanned visually to separate 
those punched "Of interest, document requested. " At Centers with larger numbers 
of participants, all the returned response cards may be sorted on EAM equipment to give 
a pull list, as well as to generate local response statistics. 

Once the librarian has separated those cards requesting documents or has a pull 
list printed, the requested items may be transferred to the usual library request form 
or, to save time, may be filled more expeditiously, as by pulling copies of microfiche 
from a file and forwarding without keeping a record. Because the second response card, 
which had been used as an address card (NASA CR-62020, pp. 22-3) was later dropped 
from the system, some libraries, to save the time of addressing a routing slip, have 
adopted the practice of using the response card as an address card, simply attaching it 
to the copy of the document before entering it into the mail system. This practice, 
however, interferes with the evaluation of response statistics. A certain fraction of 
response cards which indicate that announcements are of interest are either returned 
late, or, since the participant has to take the responsibility of returning the same card 
twice, are not returned at all. 

As these are all positive responses, it cannot be assumed, as is done in some 
SDI systems, that non-returned response cards should be considered as representing 
no-interest announcements. 

Librarians also have the problem of identifying the item requested. Only 
accession numbers appear on response cards as document identification. This is 
sufficient for pulling microfiche on file, but most full-text items announced in STAR 
and IAA are not kept on file by the accession number. This is notably true of the 
journal articles announced in IAA . Items could be identified by referring to accession 
cards or to the journal issues, a time-consuming operation. In order to aid in this 
identification, at least one NASA Center library has asked its SDI participants to send 
the abstract card to the library with the response card when requesting a document. 
When the document request is filled, the abstract card is returned to the requestor. 
Microfiche. SDI would not be practical unless the participant could obtain 
promptly the full text of the announced report or journal article. Even the present 
NASA SDI developmental system has 700 participants located at 21 research centers. 
Some of these centers are so large in area as to have more than one library facility. 

To have an original copy of each document at each point of need is manifestly 
impossible. NASA SDI here takes advantage of a basic tool that has been adopted for 
the rapid communication of scientific and technical information to all NASA's audiences: 
the negative, unitized flat microform known as microfiche. 

Microfiche of reports are prepared by the NASA Facility. Incoming reports 
are first microfilmed at an 18-to-1 reduction, arranged in a 105 x 148 mm 
(approximately 4 x 6 inch) format, and a master transparent diazo negative is prepared. 
Multiple diazo copies are then made for widespread distribution. 

The 4x6 microfiche has been adopted as standard by most Government 
agencies. It can hold 60 page images, with three images devoted to eye-legible 
information for guidance of the user. Seventy-two additional page images can be placed 
on each trailer fiche. Continuous quality control assures that even a third or fourth 
generation microfiche can serve as a master for further duplication or blow-back 
reproduction. Negative rather than positive microfiche was decided upon because 
heavy field use was anticipated for reproduction. Microfiche of reports are widely 
distributed to NASA Centers and contractors, other Government agencies, and 
domestic and foreign libraries. They are sold to the public by the Clearinghouse for 
Federal Scientific and Technical Information, Springfield, Virginia. 

In contrast to the availability of reports in the convenient microfiche form, 
journal articles are not so readily available in this form because of the copyright 
problem. However, the American Institute of Aeronautics and Astronautics, by 
special arrangements, manufactures and sells microfiche of about 50% of all the items 
announced in IAA . Microfiche of documents announced in IAA are not available from 
NASA. Organizations other than NASA Centers must arrange with the AIAA for their 
own purchase or subscriptions. 

Availability of microfiche is indicated by a "#" symbol following the accession 
number (Figs. 18 and 20). This appears in the abstract journal, on the abstract card, 
and on the response card. 

Demand on library services . Establishment of an SDI program at a research 
center has invariably been followed by a large increase in the demands placed on the 
local library service. This is understandable. Consider the individual who has just 
opened an SDI announcement and finds it to be of interest. Will he order a full-text 
copy? All he needs to do is put a pencil point down on the "Of interest, document 
requested" block, punch it out, and toss the response card in his outbox. It is 
preaddressed on the back to his local library. Certainly he will order more documents 
than someone who must search the literature himself, locate a library request form, 
telephone the library, or write a memorandum. Experience suggests that the average 
SDI participant requests three times as many items from the library as 
non-participants do. Even so, SDI users do not order documents carelessly. In fact, only 
10 to 20 per cent of announcements lead to document requests. Instead, the increase 
seems to be clear evidence that the typical scientist or engineer had not previously 
been making adequate use of the literature and had not had a simple, easy method of 
requesting library material. 

The increased demand on library services must, of necessity, be filled by 
copies of the original document; that is, by reproductions, more-or-less full-sized, 
or by microcopy. Within NASA, as its Centers are semi-autonomous in their operations 
and vary greatly in staff and facilities, library practices in filling requests also vary. 

In general, requests are filled by microfiche wherever possible. Some NASA Centers 
file sufficient microfiche copies to fill anticipated requests, simply pulling one copy 
as needed. Other Center libraries keep only a master microfiche, and duplicate a copy 
to forward to the requestor as needed. 

Since NASA SDI was initiated, there has been a dramatic positive change in the 
general user acceptance of microfiche. Partly, this is the result of increased avail- 
ability of microfiche readers. Individuals who once had little favorable to say about 
microcopy of any type, now find it convenient to take a quick look at a microfiche, 
which is often enough for their purposes. 

If the microfiche recipient still needs an eye-legible copy, this is furnished on 
request. Some libraries are able to meet all demands for large copy by blowback 
from microfiche. In addition to filling requests by their own duplication activities, 
NASA libraries may also request copies of reports from the NASA Facility. If 
reproduction is needed, the Facility uses the microfiche as a source to produce two 
images each on an 8 1/2 x 11 inch sheet of photographic paper. Experience to date
has indicated little objection to this less-than-full-size (60%) image, primarily because 
of its superior quality in comparison to that of copies prepared by other common re- 
production methods. 

SECTION 9


SYSTEM EVALUATION 


Each NASA SDI announcement has been accompanied by a response card 
(Fig. 18), which serves a variety of useful purposes: address card, document order 
form, comment card, and source of a quantitative measure of the quality of service
being provided the participant. By compiling the responses ("Of interest, etc.") 
punched by the recipients of announcements, the NASA SDI operator can determine 
which individuals are receiving unsatisfactory service (a low percentage of hits), and 
therefore need assistance in profile revision. By comparing responses to time- 
sequenced batches of announcements, as from successive issues of the associated 
abstract journals, the operator can evaluate the effects of profile changes. 

Response card processing. Deciding the extent to which response evaluation
should be carried poses questions of utility and cost effectiveness, particularly in view
of the large number of response records that must be sorted. With 700 NASA SDI
participants, returned responses total about 35,000 cards during each twice-monthly
operating period. Sorting these by Center, participant's name, each of the four
possible responses, and journal issue requires significant time, whether on EAM
equipment or internally in a computer.

The following simple procedure has been used to obtain (1) a measure of overall 
system effectiveness and (2) a measure of the degree of personal satisfaction with the 
system being experienced by selected individuals. All returned response cards (14 or 
15 tab card boxes) are sorted twice monthly on an EAM sorter. The cards for each 
individual are then tallied on an accounting machine. A sample page of the resulting 
tabulation is shown in Fig. 21. Total response figures for the overall system are also 
obtained. 

With all its limitations, this tabulation finds steady use in evaluation of 
individual responses. Used in conjunction with user accession lists (Sect. 6) for a 
given journal issue, the tabulation is examined visually to alert the profile reviewer to 
individuals who are not returning response cards or who are indicating a low ratio of 
hits to announcements. 

Ideally, responses from each participant should be tabulated for each issue’s 
announcements, in both absolute numbers and calculated percentages. The number 
of returned response cards as a percentage of total announcements should also be 
indicated. 
A POST program to accomplish this evaluation, utilizing a 1401 card-to-tape routine,
a 1410 sort, a 1410 main tabulation and format program (POST), and a 1401 print run,
has been considered, but no programming effort has been applied to date.

A second approach to full response evaluation could be through modification of 
a POST program already developed for the IBM 7090/94 data processing system. This 
program, documented in NASA CR-62021, pp. 209-14, tallies all user responses and 
maintains a historical notice-response file. Its output is illustrated in Fig. 22. In 
this figure, responses are indicated by a "+" for Of interest, document requested, a
for Of interest, document not wanted, a "0" for Have seen before, and a "2" for
Of no interest. Reasons for the match are also indicated: a two-digit figure represents the
percentage on which a match took place; "122" and "133" represent two-word and
three-word phrase matches, respectively; "143" represents a phrase match on three
words out of four; "300" indicates that a match took place on a must term; and "222" is the
same as "122", the initial 2 indicating that the phrase had been musted (no longer done).
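
The reason codes listed above can be decoded mechanically. The following Python sketch is illustrative only and is not part of the POST program. It assumes, from the examples "122", "133", and "143", that in a three-digit phrase code the second digit gives the phrase length and the third the number of words matched; the function name and the wording of the returned text are hypothetical. A "may" match at exactly 100 per cent is not covered, since the source does not show how it was printed.

```python
def decode_match_reason(code: str) -> str:
    """Translate a POST-printout match-reason code into words.

    Assumed layout (inferred from the examples in the text):
      two digits    -> percentage on which a "may" match took place
      "300"         -> match on a must term
      "1LM" / "2LM" -> phrase match, L = words in phrase, M = words matched;
                       a leading 2 marks a musted phrase
    """
    code = code.strip()
    if not code.isdigit():
        raise ValueError(f"not a numeric reason code: {code!r}")
    if len(code) == 2:
        return f"may match at {int(code)} per cent"
    if len(code) != 3:
        raise ValueError(f"unexpected reason code length: {code!r}")
    if code == "300":
        return "must-term match"
    kind, phrase_len, matched = code[0], int(code[1]), int(code[2])
    # Phrases run from two to seven words in the NASA SDI system.
    if kind in ("1", "2") and 2 <= phrase_len <= 7 and 1 <= matched <= phrase_len:
        label = "phrase" if kind == "1" else "musted phrase"
        return f"{label} match on {matched} of {phrase_len} words"
    raise ValueError(f"unrecognized reason code: {code!r}")
```

For example, `decode_match_reason("143")` describes a phrase match on three words out of four, in agreement with the text.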

As may be seen, user responses are maintained and reported in document
number order, not by user name. This is a consequence of emphasis in the parent
IBM series of SDI systems on (1) use of second response cards (NASA CR-62020, p. 22)
as address and evaluation cards for requested documents (a procedure later
dropped by the NASA SDI system), and (2) recording the number of copies of documents
requested and copies remaining in stock (in the NASA SDI system an "infinite" number
of available copies is assumed). Reprogramming to sort by user number or name,
calculate percentages, and print out statistics in a suitable format would be necessary
to utilize the POST program for full user response evaluation. No programming effort
has been applied to date in this direction.
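
As an illustration of the full response evaluation discussed above (responses tabulated per participant and per issue, in absolute numbers and percentages, together with the rate of card return), a minimal Python sketch follows. It is not a reconstruction of POST or of any NASA program. The record layout, the response labels, and the function name are assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical labels for the four possible responses on the card.
HIT_RESPONSES = {"requested", "not_wanted", "seen_before"}  # all "of interest"
ALL_RESPONSES = HIT_RESPONSES | {"no_interest"}

def tabulate_responses(responses, announcements_sent):
    """Tally returned response cards per (user, issue).

    responses          -- iterable of (user, issue, response) tuples,
                          one per returned card
    announcements_sent -- dict mapping (user, issue) to the number of
                          announcements distributed
    """
    tallies = defaultdict(Counter)
    for user, issue, response in responses:
        if response not in ALL_RESPONSES:
            raise ValueError(f"unknown response: {response!r}")
        tallies[(user, issue)][response] += 1

    rows = []
    for key in sorted(announcements_sent):
        sent = announcements_sent[key]
        counts = tallies.get(key, Counter())
        returned = sum(counts.values())
        hits = sum(counts[r] for r in HIT_RESPONSES)
        rows.append({
            "user": key[0],
            "issue": key[1],
            "sent": sent,
            "returned": returned,
            # Returned cards as a percentage of announcements distributed.
            "return_pct": 100.0 * returned / sent if sent else 0.0,
            # Hits as a percentage of RETURNED cards; undefined if none returned.
            "hit_pct": 100.0 * hits / returned if returned else None,
            "counts": dict(counts),
        })
    return rows
```

A row with a low `return_pct`, or a low `hit_pct`, corresponds to the two conditions the profile reviewer is said to look for: participants who are not returning cards, and participants indicating a low ratio of hits to announcements.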

Number of announcements. How many announcements will be selected for each
participant, and therefore, how many abstract cards must be prepared for the overall 
system, depends on many factors. Factors under the operator’s control include 
(1) minimum percentage required for a "may" match, (2) the number of words re- 
quired to match in longer phrases, and (3) most important, the degree of specificity 
written into the user interest profiles. Factors generally not under the operator's 
control include the number of citations in the journal issue and the number of subject 
index terms assigned to the document input. 

Table III presents experience on the average number of announcements
received by approximately 700 participants in the NASA SDI program during April 1 
to August 30, 1965, a period when operating conditions and the nature of the profiles 
were relatively constant. 

Return of response cards. For a meaningful evaluation of user responses, the 
return of almost all response cards is crucial. Experience has shown the difficulty 
of obtaining a good rate of return. Table IV lists the percentages returned by 
July 15, 1965, of all cards distributed since January 1, 1965, by approximate date 
of distribution.

                              Table IV

                  Rate of Returned Response Cards

               IAA                                  STAR

  Issue   Date (1965)   Per Cent       Issue   Date (1965)   Per Cent
                        Returned                             Returned

    1     Jan. 1           75            1     Jan. 8           77
    2     Jan. 15          73            2     Jan. 23          75
    3     Feb. 1           74            3     Feb. 8           80
    4     Feb. 15          75            4     Feb. 23          86
    5     Mar. 1           70            5     Mar. 8           73
    6     Mar. 15          67            6     Mar. 23          65
    7     Apr. 1           56            7     Apr. 8           57
    8     Apr. 15          51            8     Apr. 23          58
    9     May 1            60            9     May 8            59
   10     May 15           55           10     May 23           52
   11     June 1           39           11     June 8           45

These figures should be recognized as the experience of an SDI system
serving participants at 21 locations, with limited direct contact with participants. Card 
receipt and evaluation are also once removed from the individual user. That is, cards 
move from user to local library to system operator. An independent operation with all 
or most of its participants serviced by one library, which also would evaluate their 
responses, should find that meaningful response evaluations could be generated more 
promptly following the distribution of announcements. 

System performance. A measure of SDI system effectiveness is the percentage
of "hits," i.e., announcements of interest, relative to the total number of
announcements distributed. In practice, this percentage is calculated as the ratio of (1) the
returned response cards that have been rated as of interest to (2) the total number of
returned cards. For individual participants, hit percentages vary greatly, from 10
to 20 per cent for poorly conceived and inadequately reviewed interest profiles up to
80 to 85 per cent for well-written profiles. A few individuals indicate interest in an
even higher percentage of the announcements they receive, but these often have a very
concise or otherwise unusual profile.

Average percentage responses for the overall system of 700 users during the 
first six months of 1965 showed the following ranges for announcements selected from 
individual journal issues: 


                                               Per cent

     Of interest, document requested           10 - 16
     Of interest, document not wanted          33 - 43
     Of interest, have seen before              2 - 4
     Of no interest                            38 - 48

     Total of interest (hits)                  52 - 62


It should be emphasized that these are percentages of returned response cards, 
not of total announcements distributed. 
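
The practical weight of this distinction can be shown with a short calculation. Because nothing is known about the cards that were not returned, a hit percentage measured on returned cards only brackets the hit percentage for all announcements distributed: the lower bound assumes that no unreturned card was a hit, the upper bound that every one was. The Python sketch below is an illustration under those stated assumptions; the function name is hypothetical and the calculation does not appear in the NASA procedure.

```python
from typing import Tuple

def hit_rate_bounds(hit_pct_of_returned: float, return_pct: float) -> Tuple[float, float]:
    """Bound the hit percentage over ALL announcements distributed.

    hit_pct_of_returned -- hits as a percentage of returned cards
    return_pct          -- returned cards as a percentage of cards distributed
    """
    for value in (hit_pct_of_returned, return_pct):
        if not 0.0 <= value <= 100.0:
            raise ValueError("percentages must lie between 0 and 100")
    # Hits known from returned cards, expressed as a share of all cards.
    lower = hit_pct_of_returned * return_pct / 100.0
    # Worst-case uncertainty: every unreturned card might have been a hit.
    upper = lower + (100.0 - return_pct)
    return lower, upper
```

For instance, a 52 per cent hit rate on returned cards combined with a 57 per cent rate of return fixes the overall hit rate only to within roughly 30 to 73 per cent. The interval closes as the rate of return approaches 100 per cent, which is why the return of almost all response cards matters for evaluation.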

Costs. So many factors enter into the selection and distribution of an SDI
announcement that it is difficult to estimate the cost of an operating system having a 
given number of participants, searching a certain number of citations, using particular 
equipment, etc. Actual costs will be a function of the number and type of participants, 
volume of input, degree of integration with other information programs, number of 
profile changes (which in turn is determined largely by the operator's interest in 
stimulating profile improvement), and computer availability, as well as administrative 
distribution of supervisory costs, computer operator time, and other expenses over 
a number of operations. 


In Table V, some representative costs of an SDI system as it might be operated 
at approximately the size of the present NASA program are presented. The unit costs 
are based on actual experience with the NASA IBM 7090/94 program. The figures do 
not provide for economies achieved by integration with other operations, such as the
printing of additional abstract cards for other purposes, as has been done during 
operation of the NASA SDI program. This would reduce the unit costs shown. A 
substantial reduction in the unit cost of computer profile update and match would 
be gained by use of other than prime time on a 7090/94 computer, as is done by NASA. 
An announcement medium other than the abstract and response cards would alter the 
overall costs significantly. 

The figures in Table V do not include direct labor for machine operation, other 
than 7090, since the manhour-per-machine-hour factor is variable among independent 
organizations, being dependent not only upon manning policy and configuration but also 
upon the efficiency of machine utilization. Supervision, which is also a function of 
management policy, is not included in any of the representative costs. 

                              Table V

        Representative NASA SDI Costs Per Announcement

     Abstract preparation                              1.80
          Includes photographic reduction of
          journal galleys and plate making

     Abstract printing                                 1.30
          Includes offset plate preparation,
          printing and cutting

     Computer update and match                         5.60
          Rent of IBM 7094 assumed to be $500
          per hour

     Off-line machine operation                        0.90
          1410, 1401 and EAM used for response
          card punching, interpretation, and
          operational report printouts

     Announcement assembly and dispatch                1.70
          Includes envelope insertion, boxing,
          mailing and amortization of envelope
          inserter

     Profile editing and maintenance                   1.80
          Includes profile review, correspondence,
          and keypunching change cards

     Statistical response analysis                     0.40
          Includes EAM sorting

     Cost per announcement                            13.50
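
The total in Table V can be checked, and the relative weight of each item seen, with a few lines of arithmetic. The Python sketch below is illustrative only. It uses the figures exactly as printed, in the table's own units, and the variable names are hypothetical.

```python
from decimal import Decimal

# Unit costs per announcement, copied from Table V (units as in the table).
UNIT_COSTS = {
    "Abstract preparation": Decimal("1.80"),
    "Abstract printing": Decimal("1.30"),
    "Computer update and match": Decimal("5.60"),
    "Off-line machine operation": Decimal("0.90"),
    "Announcement assembly and dispatch": Decimal("1.70"),
    "Profile editing and maintenance": Decimal("1.80"),
    "Statistical response analysis": Decimal("0.40"),
}

# Sum of the seven items; agrees with the printed total of 13.50.
total_cost = sum(UNIT_COSTS.values(), Decimal("0"))

# Each item's share of the total, to the nearest 0.1 per cent.
cost_shares = {
    item: (cost / total_cost * 100).quantize(Decimal("0.1"))
    for item, cost in UNIT_COSTS.items()
}

for item, share in cost_shares.items():
    print(f"{item:<38}{UNIT_COSTS[item]:>6}  {share:>5} per cent")
print(f"{'Cost per announcement':<38}{total_cost:>6}")
```

Computer update and match is the largest single item, at about 41 per cent of the total, which is consistent with the remark that running outside prime time on the 7090/94 gives a substantial reduction in unit cost.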




APPENDIX A 


AUXILIARY NASA SDI COMPUTER PROGRAMS 

There are currently six 1401 and three 1410 programs in the SDI system. All the 1410
programs are sorts which utilize the standard 1410 operating system sort. Tape formats 
are available as part of the system documentation. 

1. SDI 01 (1410) 

a. Sorts the NOTICE output tape from the 7094 MATCH program. 

MAJOR DOCUMENT NUMBER 

MINOR CENTER (LOCATION) 

b. The output of this sort is utilized by programs SDI 10, SDI 11, and FORMT. 

c. Approximate run time per 15,000 notices - 15 Min.

2. SDI 02 (1410) 

a. Sorts the NOTICE output tape from the 7094 MATCH program. 

MAJOR USER NUMBER 

MINOR DOCUMENT NUMBER 

b. The output of this sort is utilized by SDI 12. 

c. Approximate run time per 15,000 notices - 20 Min.

d. Tape formats are identical to SDI 01. 

In that the program SDI 12 (see below) is no longer required, SDI 02 is not 
currently in operation. 

3. SDI 03 (1410) 

a. Sorts the NOTICE output tape from the 7094 MATCH program. 

MAJOR CENTER (LOCATION) 

INTERM USER LAST NAME (1st 10 digits) 

MINOR DOCUMENT NUMBER 

b. The output of this sort is utilized by SDI 13. 

c. Approximate run time per 15,000 notices - 20 Min.

d. Tape formats are identical to SDI 01. 

All 1410 sorts could be run on a 1401 with at least four tape drives and
8K core storage should no 1410 be available. However, a 1410 can perform
the sorts 30-40% faster.

Any 1410 configuration having at least 5 tape drives can handle these sorts. 

4. SDI 10 (1401)

a. This program lists on the 1403 printer a double columnar arrangement 
indicating: 

1. The total number of notices generated for each document. 

2. The total number of abstract cards to be produced for each document. 

Currently, the figure for item 2 is the total for item 1 plus 50. The total of all notices
produced for all documents is also provided. See Section 6 and Figure 9. 

b. Input file is the output of SDI 01. 

c. Approximate run time - 15 Min. 

5. SDI 11 (1401) 

a. This program punches the notices into the blue response cards. The
two-digit issue number, as supplied by a control card, is punched in columns 72
and 74.

If the document microform code is a 1, it is replaced for interpretation by a
"#." If the code is other than 1, it is removed.

b. Input file is the output of SDI 01. 

c. The response card format is shown in Figure 19. 

d. Approximate run time - 200 notices/minute. 

6. SDI 12 (1401) 

a. This program produces a listing on the 1403 printer which is a breakdown by 
each user of those notices (documents) which have been selected from the 
current issue. In addition, a reason for the notification is also indicated, 
i.e., whether the notice is the result of a MAY percentage, a MUST or a 
PHRASE (2 out of 2, 3 out of 3, etc.). See Section 6. 

b. Input file is the output of SDI 02. 

c. Approximate run time for 15,000 notices - 30 Min.

d. The need for this program has been obviated by SDI 13 so it was discontinued 
as of Issue 08. 

7. SDI 13 (1401) 

a. This program is identical to SDI 12 except that the user-document breakdown
is arranged by NASA Center. See Section 6 and Figure 10.

b. Input file is the output of SDI 03. 

c. Approximate run time for 15,000 notices - 30 Min.

8. FORMT (1401) 

a. This program converts the sorted notices, output of SDI 01 (short form), into
a format acceptable to the 7094 POST program (long form).

b. Input file is the output of SDI 01. 

c. Output file is input to POST.

d. No running times are available. 


9. NPOST (1401)

a. This program takes the NOTICE cards returned by the users (now referred to
as RESPONSE cards), as well as any necessary control cards, and loads them
onto tape as input to the POST program. A detailed write-up is found in report
NASA CR-62021, pp. 203-208.

b. No run times are available. 

All 1401 programs can be run on a system consisting of: 8K core, two 7330/ 
729 tape drives, 1402 reader-punch, 1403 printer, and Advanced Programming 
Package. If sorting is to be done on the 1401, 4 tape drives are required. 
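
For readers without access to the 1410 sort, the three sort orders (SDI 01, SDI 02, SDI 03) and the counting rule of SDI 10 can be expressed compactly. The Python sketch below is a modern illustration only and does not reproduce the tape formats. The record fields and function names are assumptions. The major, intermediate, and minor keys and the allowance of 50 extra abstract cards are taken from the descriptions above.

```python
from collections import Counter
from typing import NamedTuple

class Notice(NamedTuple):
    """One notice from the MATCH output (field names are hypothetical)."""
    document_number: str
    center: str
    user_number: int
    user_last_name: str

def sdi01_key(n: Notice):
    # Major: document number. Minor: Center (location).
    return (n.document_number, n.center)

def sdi02_key(n: Notice):
    # Major: user number. Minor: document number.
    return (n.user_number, n.document_number)

def sdi03_key(n: Notice):
    # Major: Center. Intermediate: first 10 characters of the user's
    # last name. Minor: document number.
    return (n.center, n.user_last_name[:10], n.document_number)

EXTRA_ABSTRACT_CARDS = 50  # current allowance stated for SDI 10

def sdi10_listing(notices_in_document_order):
    """Return ([(document, notices, abstract cards)], total notices)."""
    per_document = Counter(n.document_number for n in notices_in_document_order)
    listing = [
        (doc, count, count + EXTRA_ABSTRACT_CARDS)
        for doc, count in per_document.items()
    ]
    return listing, sum(per_document.values())
```

Typical use would be `sorted(notices, key=sdi01_key)` followed by `sdi10_listing(...)` on the result, mirroring the way SDI 10 takes the output of SDI 01 as its input file.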

Figure 7. Subject Authority List

Figure 11. User Transaction List

Figure 12. User Transaction Summary

Figure 13. Document Profile List

Figure 14. Document Profile Summary

Figure 16. Historical User Profile Report

Figure 17. Sample User Profile

Figure 21. Statistical Response Tabulation

Figure 22. POST Printout



